Dealing with commas in CSV files with the csv-river plugin - indexing

I am trying to index data from a CSV file into an Elasticsearch server. The problem is that the strings themselves contain multiple commas, so indexing fails with an IndexOutOfBoundsException.
How can I handle commas using the csv-river plugin?
Edit:
The example file would be:
MESSAGE_ID,PARENT_MESSAGE_ID,THREAD_ID,FORUM_ID,FORUMINDEX,USER_ID,SUBJECT,BODY,MODVALUE,FORUM_NAME,CATEGORY_NAME,LIKES,DISLIKES,IS_ROOT_MESSAGE,IS_QUESTION
244,195,103,4,3,341,Re: The most stupidest program I've ever seen--Amazon,"I know nothing of your case, but I do know that throwing around terms like ""stupid idiot"" doesn't exactly help your side any.",1,"Order Management, Shipping, Feedback & Returns",Sell on Amazon,,,no,no

You need to enclose your fields in quotes. If a field contains a quote, you need to escape it with a preceding quote.
For example:
"field1","field2","field3 with, commas","field4","field ""5"" with quotes","field6"

Related

Multi-line text in a .env file

In Vue, is there a way to have a value span multiple lines in a .env file? Ex:
Instead of:
someValue=[{"someValue":"Here is a really really long piece which should be split into multiple lines"}]
I want to do something like:
someValue=`[{"someValue":"Here is a really
really long piece which
should be split into multiple lines"}]`
Doing the latter gives me a JSON parsing error when I try to do JSON.parse(someValue) in my code.
I don't know if this will work, but I can't format a comment well enough to get the point across, so see if this does:
someValue=[{"someValue":"Here is a really\
really long piece which\
should be split into multiple lines"}]
Where "\" should escape the newline similar to how you can write long bash commands while escaping the newline. I'm not certain the .env interpreter will support it though.
EDIT
Looks like this won't work. This syntax was actually proposed, but I don't think it was incorporated. See motdotla/dotenv#333 (which is what Vue uses to parse .env).
Like #zero298 said, this isn't possible. However, you could delimit the entry with a character that wouldn't normally show up in the text (^ is a good candidate), then restore the newlines within the application using someValue.replace(/\^/g, '\n'); (note that the string form string.replace('^', '\n') would only replace the first occurrence).
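A rough sketch of that idea, written in Python for brevity (the value and the ^ delimiter are taken from the suggestion above; the same two-step shape applies in JavaScript). Note the order: parse the JSON while everything is still on one line, then restore the newlines, because a literal newline inside a JSON string would itself break the parse:
import json

# Single-line value as it would come out of the .env file, with ^
# standing in for the newlines wanted later.
raw = '[{"someValue":"Here is a really^really long piece which^should be split into multiple lines"}]'

parsed = json.loads(raw)                          # parse first, on one line
text = parsed[0]["someValue"].replace("^", "\n")  # then restore the newlines
print(text)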

Extra quote marks being added to String field in dataframe

I'm trying to do some text processing on entries in a TSV file, so I loaded it in as a dataframe, and I'm trying to add a quotation mark at the beginning of a certain entry in the dataframe. The code I'm using to do this is as follows:
episode_info.loc[i, 'word'] = "\"" + episode_info.loc[i, "word"]
However, the result I'm getting when I look at the output is """help" instead of just "help, while the previous entry shows just help, so I don't know why this isn't working.
Okay, I printed the entries in question to the terminal and it looks like they were correct all along. I guess when I viewed the file in Sublime, which is what I was using, the quotation marks were being displayed oddly. Apologies for the unnecessary question.
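For anyone who lands here with the same symptom: the extra quotes are only part of the string's repr as shown by some tools, not part of the stored data. A small reproduction (the dataframe contents are made up):
import pandas as pd

episode_info = pd.DataFrame({"word": ["help"]})
episode_info.loc[0, "word"] = "\"" + episode_info.loc[0, "word"]

print(repr(episode_info.loc[0, "word"]))  # '"help' -- repr adds its own quotes
print(episode_info.loc[0, "word"])        # "help  -- the value actually stored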

Flat File Schema lines longer than expected

Hello there Stack Overflow, I've been tasked with making a flat file schema as well as a map. However, our specification says there are 3 fields:
----------
Name        Length
----------
TIdentity   2
OIdentity   17
Result      2
However, the file that we receive is 500(ish) characters long. Is there a way to make it ignore the remaining empty characters?
Thanks for any help you guys might be able to supply
You should definitely ensure the spec and sample files are correct (particularly that the spec covers any whitespace requirements/options), but assuming they are and you're just supposed to ignore the whitespace, you can create a node to put the whitespace into and simply ignore it.
Without knowing a bit more about your requirements, it's hard to say exactly how this should work. If the whitespace is always a fixed length, make a node that expects that many characters. If it's not always a fixed length, you may have to make a repeating node that's one character long but not the record terminator (presumably CR/LF or something of the like). If the whitespace itself is the delimiter, you might be able to do something with the ignore_trailing_delimiter on the record.
Worst case scenario (whitespace is variable, you can't control the partner who sends it to you, and you can't get the FFDASM to sensibly deal with it), write a custom Decode component to preprocess the file and remove the extraneous whitespace.
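If you do end up preprocessing the file, the trimming logic itself is trivial. Here is a sketch of the idea in Python (the file names are placeholders, and the field lengths come from the spec above; in BizTalk this would live in a custom pipeline component, typically written in C#):
# Field lengths from the spec: TIdentity=2, OIdentity=17, Result=2.
RECORD_LENGTH = 2 + 17 + 2  # 21 characters of real data per line

with open("input.txt") as src, open("trimmed.txt", "w") as dst:
    for line in src:
        # Keep only the positional fields; drop the trailing filler.
        dst.write(line.rstrip("\r\n")[:RECORD_LENGTH] + "\n")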

Access VBA, importing a CSV file via TransferText with comma as decimal separator and semicolon as delimiter

I'm having some problems importing double numbers from CSV files. The files have a semicolon delimiter and a comma as the decimal separator.
I can't set up import specs, since the order of the fields in the CSV often changes and it would be a disaster if the data went into the wrong fields.
Also, the CSV files have to be written to a temporary table first. Don't hate me for it, but since I have to process the data and set some information fields for later processing, this is by far the easiest, fastest and safest way to achieve it.
Here is the problem itself:
When using TransferText it will import, but of course it interprets the commas as delimiters. Not good ...
When I replace commas with full stops and semicolons with commas, it works, but it ignores the full stops, so 1.2 becomes 12 and 1.333 becomes 1333. The field is of type Double.
I've tested numerous things. Besides TransferText I've tried:
DoCmd.RunSQL ("INSERT INTO Tabelle1 SELECT cdbl(a1) as aa FROM[TEXT;FMT=Delimited;HDR=YES;CharacterSet=437;DATABASE=C:\SPOT].[test.csv]")
But nothing seems to work; even when I create a new table with field type Double before using TransferText, the decimals are still ignored.
So I would be happy if you could tell me either how to use TransferText with or without replacing semicolons and commas in a first step, or how to use the INSERT INTO approach.
Thank you very much!
Ok, I think I got it!
The problem was the regional settings: my Access uses a comma as the decimal separator. I was also not able to create an import spec via a manual import, since that requires defining in advance which fields have to be imported.
What I did now was this:
Open the table MSysIMEXSpecs that contains the import specs via a query:
select * from MSysIMEXSpecs
Then add a new row and set SpecName = "Whatever", DecimalPoint = "," and FieldSeparator = ";", plus whatever other settings have to be made.
Since this workaround exists, isn't there an easier way to do this?

How to use an escape character for a big string?

I have a big string, precisely an XSLT code, that I would like to hardcode in my VB.net program. I tried putting a " before every quotation mark, but it still didn't work out, and it's pretty tedious to place it 100 times. Using Chr(34) is also not the best solution.
Is there some way, like putting # (or another character) before the string itself, that will handle escaping for all the characters in the string that need to be escaped?
If it is a large string, why not save it to a file and then read the file into memory when you want to use it? That way you don't have to do any escaping, and it will be easy to modify if you decide to change it.
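As a minimal sketch of that approach (shown in Python for brevity, with a made-up file name; in VB.net the equivalent is System.IO.File.ReadAllText):
from pathlib import Path

# Read the XSLT once at startup instead of hardcoding it with escapes.
xslt_source = Path("transform.xslt").read_text(encoding="utf-8")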