

Charset in Export to CSV



Paul Savostin
  • Member

Posted 08 May 2014 - 14:28 PM

Hi guys! I need some help here.

I need to export and import .csv files with GC, and I have run into some trouble with charsets. Export seems fine: just choose the columns you need to export and everything works. The default charset for export in GC is UTF-16LE (I read in a post on this forum that this is the best choice for Excel on Windows).

But to save to the DB I need UTF-8 (my MySQL DB was created with it). The usual workflow is: export from GC, edit the file in Excel (or even create it there), and then import it back into the DB. So, my questions:

1.) Does Excel work with UTF-16LE, and does it create or save files in that charset?

2.) Or should I change the default GC export charset to UTF-8? How would that work with Excel?

Maybe someone has faced the same problem and can advise me on the best approach.

Workaround:

In a text editor like Notepad++, the UTF-16LE charset of the GC export is shown as UCS-2 Little Endian (Wikipedia says they are the same; strictly, they are identical only for characters in the Basic Multilingual Plane).

In my app, after importing the file, I try to read it string by string and decode each string from UTF-16LE to UTF-8 with iconv('UCS-2LE / UTF-16LE', 'UTF-8', $str)

or mb_convert_encoding($str,'utf-16','utf-8'), and then use functions like str_getcsv() or similar.

 

result is awful -> ᨀ᠄ ऄ℀㸄䐄㠄㤄䄄㨄〄伄ऄ㄀ᨀഄ਀

 

or

  [?? 09>=] =>  
            [#;8F0] => !>D89A:0O
            [1J5:B] => 1

It does work if I do no conversion on import and instead re-encode the exported file in Notepad++ as UTF-8 without BOM. Then everything works fine!
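
What Notepad++ does here (re-encoding the whole file as UTF-8 without BOM) could also be sketched in PHP. This is only a rough sketch under some assumptions: the function name utf16le_csv_to_rows is made up for illustration and is not part of GC, and the tab delimiter is a guess about the export format that may need changing. Two details differ from the attempt above and may or may not be what caused the garbled output: mb_convert_encoding() takes the target encoding before the source encoding, and the whole buffer is converted before it is split into lines, so a two-byte UTF-16 code unit is never cut in half at a line break.

```php
<?php
// Hypothetical helper (not part of GC): decode a UTF-16LE CSV export into UTF-8 rows.
// The tab delimiter is an assumption; pass another one if the export differs.
function utf16le_csv_to_rows(string $raw, string $delimiter = "\t"): array
{
    // Drop the UTF-16LE byte order mark (FF FE) if the file starts with one.
    if (strncmp($raw, "\xFF\xFE", 2) === 0) {
        $raw = substr($raw, 2);
    }

    // Convert the whole buffer first. Argument order: string, TO encoding, FROM encoding.
    $utf8 = mb_convert_encoding($raw, 'UTF-8', 'UTF-16LE');

    // Split into lines only after conversion, then parse each line as CSV.
    $lines = preg_split('/\r\n|\r|\n/', rtrim($utf8, "\r\n"));
    $rows = [];
    foreach ($lines as $line) {
        $rows[] = str_getcsv($line, $delimiter, '"', '\\');
    }
    return $rows;
}
```

It would be called on the raw file contents, for example utf16le_csv_to_rows(file_get_contents($path)), where $path is whatever upload path the app uses.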

Any help and advice will come in handy. Many thanks and have a nice day!


Paul Savostin
  • Member

Posted 10 May 2014 - 01:14 AM

Resolved:

Excel opens the files from the GC export without problems. In Excel I then save the file in the ANSI charset, and when I import the data into GC I just recode it from ANSI (in my case this is CP1251/WINDOWS-1251) to UTF-8, and all works fine!

Conclusion: it is Excel's fault :)
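
For anyone landing here later, the recoding step described above might look roughly like this. It is a minimal sketch, not the exact code from my app: the function name ansi_csv_to_rows is invented for illustration, CP1251 is only correct if that is the ANSI code page of the Windows installation that saved the file, and the semicolon delimiter is an assumption about how Excel wrote the CSV.

```php
<?php
// Hypothetical helper: recode a CSV that Excel saved as ANSI into UTF-8 rows.
// Both defaults are assumptions: CP1251 as the ANSI code page, ';' as the delimiter.
function ansi_csv_to_rows(string $raw, string $from = 'CP1251', string $delimiter = ';'): array
{
    // iconv() argument order: FROM encoding, TO encoding, string.
    $utf8 = iconv($from, 'UTF-8', $raw);
    if ($utf8 === false) {
        throw new RuntimeException("Could not convert from $from to UTF-8");
    }

    // Split into lines and parse each one as CSV.
    $lines = preg_split('/\r\n|\r|\n/', rtrim($utf8, "\r\n"));
    $rows = [];
    foreach ($lines as $line) {
        $rows[] = str_getcsv($line, $delimiter, '"', '\\');
    }
    return $rows;
}
```

The resulting UTF-8 rows can then be written to a MySQL table created with a UTF-8 charset.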