Channel: SCN: Message List - SAP HANA Developer Center

Re: Problem with IMPORT FROM when file used SOH character as field delimiter - created using AWS HIVE


Hi Aron,

    Thanks for your reply - I was going to try the external table approach you suggested, with an S3 bucket for the file location. For now, I'm fine with the sed approach - it took about 26 seconds to rip through 256 MB of Hive data, replacing the SOH characters with '|' for 5.5 million rows.
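For anyone following along, the replacement step above can be sketched as a one-liner. The file names and sample data here are hypothetical, and the `\x01` escape in the pattern assumes GNU sed:

```shell
# Hive's default field delimiter is SOH (Ctrl-A, 0x01).
# Create a small sample file the way Hive would emit it (hypothetical data):
printf 'id\x01name\x01qty\n1\x01widget\x0142\n' > hive_export.csv

# Replace every SOH with a pipe (the \x01 escape needs GNU sed):
sed 's/\x01/|/g' hive_export.csv > hive_export_pipe.csv

cat hive_export_pipe.csv
```

Since this is a straight one-byte-for-one-byte swap, `tr '\001' '|' < hive_export.csv > hive_export_pipe.csv` is a POSIX-portable (and often faster) alternative.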

    HANA One imported the resulting pipe-delimited file in 49 seconds - not bad. All told, I now have 523 MB loaded into my column table: 11.7 million rows taking up 203 MB of RAM, for approximately 2.6x compression in memory compared to on disk.

   The problem I have with Sqoop-based solutions on AWS is that it's expensive to keep the cluster up and running with the Hive database. In my current solution, the data sits in S3 storage; I spin up a 6-node cluster when I need to run some "quick" queries, since I can't load all the data into HANA One at once. I then write the Hive query results out to S3 and tear down the cluster as soon as I've checked that the output is as expected.
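That spin-up/query/tear-down cycle can be sketched with the AWS CLI as a transient EMR cluster that runs a single Hive step and terminates itself. All names, paths, and instance types below are placeholders, not details from my actual setup:

```shell
# Launch a transient 6-node EMR cluster that runs one Hive script
# against data in S3 and terminates when the step finishes.
# Bucket names, paths, and the instance type are hypothetical.
aws emr create-cluster \
    --name "transient-hive-queries" \
    --release-label emr-6.15.0 \
    --applications Name=Hive \
    --use-default-roles \
    --instance-type m5.xlarge \
    --instance-count 6 \
    --log-uri s3://my-bucket/emr-logs/ \
    --auto-terminate \
    --steps Type=HIVE,Name=extract,Args=[-f,s3://my-bucket/queries/extract.hql]
```

The Hive script itself would end with something like `INSERT OVERWRITE DIRECTORY 's3://my-bucket/output/' SELECT ...`, so the results are safely in S3 before the cluster goes away.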

   However, what's really needed is for the IMPORT FROM statement to support non-printable characters as field delimiters, for maximum flexibility.

Thanks,

Bill

