Li Ming
SilverJim
Failed to initialize the context: quantized V cache was requested, but this requires Flash Attention
#4 opened about 2 months ago by SilverJim · 2 replies
Does it work to quantize an existing LLM to 1.58-bit?
#1 opened 2 months ago by SilverJim · 1 reply
SuperHOT for the 7B model has been released & I need to merge orca_mini_7B-GPTQ with SuperHOT
#2 opened over 2 years ago by SilverJim