Channel: SCN: Message List - SAP HANA Developer Center

Re: Problem with IMPORT FROM when file used SOH character as field delimiter - created using AWS HIVE


FWIW - I used the following sed command to replace the SOH characters with commas

sed "s/\x01/,/g" 000000 > 000000.csv

I have two challenges with this approach:

  1. It introduces another process into my "Big Data" flow using AWS HIVE.
  2. Not all of my records load! The file has 30,000 rows, but only 2,000 load. To make matters worse, it's not even the first 2,000 rows that were loaded.
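One quick check is whether the rows went missing in the sed step or during the import. Comparing line counts before and after the conversion settles it; here is a minimal sketch on a hypothetical stand-in file (the real file keeps the same structure):

```shell
# Hypothetical stand-in for the 30,000-row Hive output file.
printf 'row1\nrow2\nrow3\n' > 000000.demo

# Same substitution as used on the real file.
sed "s/\x01/,/g" 000000.demo > 000000.demo.csv

# Matching counts mean sed dropped nothing; the missing rows were
# rejected during IMPORT, not during the conversion.
wc -l < 000000.demo
wc -l < 000000.demo.csv
```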

Needless to say, '\x01' didn't work directly in the IMPORT FROM statement.

The reason I didn't get the expected result is that some of my data values occasionally contained a comma. The good news is that I could use the '|' pipe character instead, as there were none in the source file.
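Before settling on '|', it's worth confirming the character really never occurs in the source data. A quick check, shown here on a tiny hypothetical sample built with the same SOH (\001) delimiters Hive uses:

```shell
# Hypothetical two-row sample; \001 is the octal escape for the SOH byte.
printf 'en.wikipedia\001Main_Page\00112\n'  > 000000.sample
printf 'en.wikipedia\001Foo,Bar\00134\n'   >> 000000.sample

# Count lines containing a literal pipe; 0 means '|' is safe to use
# as the replacement field delimiter.
pipes=$(grep -c '|' 000000.sample || true)
echo "$pipes"
```

Run the same grep against the real file before converting it.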

 

So, the updated sed command achieved the desired result:

sed "s/\x01/|/g" 000000 > 000000.csv
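The substitution can be sanity-checked on a small sample, including the embedded-comma case that broke the comma-based version (hypothetical rows; GNU sed understands the \x01 escape):

```shell
# Two sample rows with SOH (\001) field delimiters, one containing
# an embedded comma.
printf 'en.wikipedia\001Main_Page\001100\n'  > sample
printf 'en.wikipedia\001Some,Page\001200\n' >> sample

# Replace every SOH byte with a pipe; the embedded comma is untouched,
# so field boundaries stay correct.
sed "s/\x01/|/g" sample > sample.csv
cat sample.csv
```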

 

I then changed my IMPORT FROM command to the following:

IMPORT FROM CSV FILE '/wiki-data/year=2013/month=04/000000.csv'

INTO "WIKIPEDIA"."pagecounts"

WITH RECORD DELIMITED BY '\n'

FIELD DELIMITED BY '|';

 

As a result, I got all 30,000 records. I'd still like to be able to process the file directly - any help would be appreciated. At least I'm not blocked for now.

Regards,

Bill

