4B not good enough for image transcription + translation in a single query?

#1
by metalglot - opened

I've got this working for image processing, but at low quantization levels 4B returns nonsense, and at high quantization levels it just returns English.

Whereas 12B/27B return the transcribed text fully translated.

Has anyone else run into this issue?

I think the workaround for 4B not transcribing + translating in one query is to separate it into two queries? One to transcribe the contents of the image, then a second to translate the transcription.

metalglot changed discussion title from 4B not good enough for transcription + translation in a single query? to 4B not good enough for image transcription + translation in a single query?

Try it. Transcribing plus translating seems a bit much all at once for a 4B model. Generally, 4B might simply be too small for such complex tasks.

I have tried it. With a high enough quant it only transcribes; with a lower quant it returns nonsense. IMO it needs a transcription pass, then a separate translation query, rather than both in one query.
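
For anyone who wants to try the two-pass approach, here is a minimal sketch of what I mean. It is not tied to any particular runtime: `generate` is a hypothetical callable that you would wrap around whatever backend you use, taking a prompt plus an optional image path and returning the model's text. The function name `transcribe_then_translate` and the prompt wording are also my own assumptions, not anything from the model card.

```python
from typing import Callable, Dict, Optional

# Hypothetical model interface (an assumption, not a real library API):
# takes a prompt and an optional image path, returns the generated text.
GenerateFn = Callable[[str, Optional[str]], str]


def transcribe_then_translate(
    generate: GenerateFn,
    image_path: str,
    target_language: str,
) -> Dict[str, str]:
    """Run transcription and translation as two separate queries."""
    # Pass 1: image -> text in the original language, no translation.
    transcription = generate(
        "Transcribe all text in this image exactly as written. "
        "Do not translate it.",
        image_path,
    ).strip()

    # Pass 2: text-only query, so the model only has to translate.
    translation = generate(
        f"Translate the following text into {target_language}. "
        f"Return only the translation.\n\n{transcription}",
        None,
    ).strip()

    return {"transcription": transcription, "translation": translation}
```

Keeping the transcription in the result makes it easier to tell which pass went wrong when the output looks off. Whether this actually fixes 4B is something I have not measured; it only splits the task so each query asks for one thing.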
