Hi there,
Good to see you are trying to get more out of GC.
As far as text extraction is concerned, there are PHP libraries available to extract text from a PDF, and you can use them. Be warned, though: the results are pretty rough, because they extract plain text with all the line breaks rather than formatted text. At least, I have not managed to get better results with the free libraries available.
If you want to generate an image from the PDF, you can use ImageMagick. It has a convert utility that will turn your PDF into a JPG. There are open-source PHP libraries that convert PDF to image as well, but in my personal experience calling the ImageMagick utility directly works better than going through a library.
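To give a rough idea of the conversion step, here is a minimal sketch that shells out to convert from PHP. The function names, the density and the quality values are my own assumptions, not anything from GC. It also assumes convert is on the PATH and that Ghostscript is installed, since ImageMagick relies on it to read PDFs. On newer ImageMagick installs the command may be magick instead of convert.

```php
<?php
// Hypothetical helper (not part of GC): builds the ImageMagick command line.
function pdf_preview_command(string $pdfPath, string $jpgPath): string
{
    // "[0]" asks ImageMagick for the first page of the PDF only.
    $source = escapeshellarg($pdfPath . '[0]');
    $target = escapeshellarg($jpgPath);

    // -density and -quality values are arbitrary starting points.
    return "convert -density 150 $source -quality 85 $target";
}

// Hypothetical helper: runs the command and reports whether convert exited cleanly.
function make_pdf_preview(string $pdfPath, string $jpgPath): bool
{
    exec(pdf_preview_command($pdfPath, $jpgPath), $output, $status);
    return $status === 0;
}
```

You would call make_pdf_preview() once after the upload finishes, for example with the uploaded PDF path and the same path with a .jpg extension.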
Once you have converted the PDF, the next step is showing a preview of it. For that you need to modify the code in the GC library so that it renders a thumbnail of the PDF and shows the preview image instead of the PDF link.
One function to look at is get_upload_file_input.
That is just one function. You need to find every place where the library checks for '.jpg' to decide whether to render an image, and add a PDF case there too: when the file is a PDF, use the converted image rather than the PDF link.
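The "use the converted image when it is a PDF" rule can be sketched as a small helper like the one below. This is not GC code. preview_file_for is a made-up name, and it assumes the JPG was saved next to the PDF with the same base name. You would call something like it at the places in the library where the '.jpg' check happens.

```php
<?php
// Hypothetical helper (not part of GC): picks the file to show as a preview.
function preview_file_for(string $fileName): string
{
    $ext = strtolower(pathinfo($fileName, PATHINFO_EXTENSION));

    if ($ext === 'pdf') {
        // Assumption: the converted image sits beside the PDF as <name>.jpg.
        return substr($fileName, 0, -4) . '.jpg';
    }

    // Anything else is shown as it was uploaded.
    return $fileName;
}
```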
As for your requirement of multiple text boxes, you can achieve it with the same or similar JavaScript. One thing to remember is to use callback_before_insert or callback_insert to handle such additional inputs. If you don't, GC will try to create a new record with all the fields it finds, and if it cannot find a matching field name in the table, it will throw an error.
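As a rough illustration of handling the additional inputs, the sketch below separates the posted values into real table columns and extras. split_extra_inputs, strip_extras and the column names are all made up for the example. Only callback_before_insert is the GC call, and the wiring is left in comments because it depends on your controller.

```php
<?php
// Hypothetical helper (not part of GC): separates real columns from extra inputs.
function split_extra_inputs(array $postArray, array $tableColumns): array
{
    // Keep only the keys that match actual table columns.
    $row = array_intersect_key($postArray, array_flip($tableColumns));

    // Everything else came from the additional text boxes.
    $extras = array_diff_key($postArray, $row);

    return [$row, $extras];
}

// Inside your controller the wiring might look like this (column names assumed):
//
//   $crud->callback_before_insert(array($this, 'strip_extras'));
//
//   function strip_extras($post_array) {
//       list($row, $extras) = split_extra_inputs($post_array, array('title', 'body'));
//       // ... store $extras wherever they belong ...
//       return $row; // GC inserts only what is returned here
//   }
```

If the extra values need the new record's id, callback_insert may be the better fit, since there you do the insert yourself.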
Hope the solution shared works for you.
Happy GCing :)