Is this using something like https://github.com/creatale/node-fv on the backend, which can map imperfectly scanned forms to data once you prepare a schema? Or is it a simpler "mark hotspots" approach that won't work well, or at all, if the scan is not perfectly aligned and sized to match the original?

We do position-based text extraction. However, we also apply an 'unpaper' step, which tries to correct misalignments and improve the quality of the scan.
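As a rough illustration of what position-based extraction means (a sketch only, not the service's actual code), each recognized word can be assigned to the template field whose box contains the word's center. The `extract_fields` helper, the field names, and the pixel coordinates below are all hypothetical, and the sketch assumes single-line fields.

```python
from typing import Dict, List, Tuple

# (left, top, right, bottom) in page pixels; a hypothetical convention
Box = Tuple[float, float, float, float]


def extract_fields(words: List[Tuple[str, Box]], schema: Dict[str, Box]) -> Dict[str, str]:
    """Assign each word to the first schema field whose box contains the word's center."""
    hits: Dict[str, List[Tuple[float, str]]] = {name: [] for name in schema}
    for text, (left, top, right, bottom) in words:
        cx, cy = (left + right) / 2, (top + bottom) / 2
        for name, (f_left, f_top, f_right, f_bottom) in schema.items():
            if f_left <= cx <= f_right and f_top <= cy <= f_bottom:
                hits[name].append((left, text))
                break
    # Order the words of each field left to right (single-line fields assumed)
    return {name: " ".join(t for _, t in sorted(items)) for name, items in hits.items()}


# Hypothetical template and word boxes
schema = {"name": (100, 40, 300, 80), "date": (380, 40, 500, 80)}
words = [
    ("Doe", (160, 52, 200, 70)),
    ("John", (110, 50, 150, 70)),
    ("2024-01-05", (400, 50, 480, 70)),
]
result = extract_fields(words, schema)

# The same words shifted 300 px to the right, as in a badly aligned scan
shifted = [(t, (l + 300, top, r + 300, b)) for t, (l, top, r, b) in words]
shifted_result = extract_fields(shifted, schema)
```

In this sketch the shifted scan puts the name into the date field, which is the failure mode that an alignment-correcting step is meant to reduce.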

Which OCR library do you use? Which languages does it support?

For scanned images we use https://github.com/tesseract-ocr/tesseract. For text-based PDFs we pull the text directly from the file, so all languages are supported there.
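The two paths can be sketched roughly as below. This is an assumption about how such a dispatch might look, not a description of the actual backend: `extract_text` is a hypothetical helper, and treating an empty text layer as the signal for "scanned" is a guess. The OCR step is passed in as a callback so the sketch stays self-contained. With Tesseract, that callback could wrap the command-line tool, where `-l` selects the language (for example `tesseract scan.png out -l deu`) and the available languages depend on which trained data files are installed.

```python
from typing import Callable


def extract_text(embedded_text: str, run_ocr: Callable[[], str]) -> str:
    """Prefer the file's embedded text layer; fall back to OCR only if it is empty."""
    if embedded_text.strip():
        # Text-based PDF: use the text as stored, no OCR language model involved
        return embedded_text
    # Scanned image: no usable text layer, so run OCR (hypothetical callback)
    return run_ocr()


# Stand-in OCR callback for illustration; a real one would call an OCR engine
def fake_ocr() -> str:
    return "text recognized by OCR"


from_text_pdf = extract_text("Rechnung Nr. 42", fake_ocr)
from_scan = extract_text("   ", fake_ocr)
```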