I'm using @JsonIgnoreProperties({"id"}) on a subclass in Spring. This seems to stop the data from being output, but I still get the "Id" header on the first line.
Example:
Id, Name, Address
,xyz,ayaya
How can I stop the header from containing Id, i.e. the columns that are ignored?
I am fumbling around with Pandas (I want to avoid using Excel; I have very basic knowledge of Pandas and a reasonable knowledge of Python), trying to add a column based on another column.
Specifically, I have a column with IDs, and I want to enrich my data by making a HTTP query to an API and using a field in the JSON response:
d['m0'] = pd.read_json(f"http://localhost:3000/{d['id']}")['H']['M0']
What I wanted to say in the above was
take the data from a cell in the column id, run the API query, and put the ['H']['M0'] of the JSON response (a string) into the column m0
What I get is
InvalidURL Traceback (most recent call last)
Cell In[5], line 1
----> 1 d['m0'] = pd.read_json(f"http://localhost:3000/{d['id']}")['H']['M0']
I feel that the way the URI was built is not correct, i.e. the content of the cell for column id was not used, but rather the whole column:
InvalidURL: URL can't contain control characters. '/0 AA13\n1 BB10\n2
AA13, BB10, ...are the ids in the column
If I understand it correctly, you have various ID values and you want to automate it by fetching the JSON corresponding to that ID and use values from JSON to further populate the dataframe.
I have not seen the structure of your JSON or the return type of the accessed fields, but I think you are looking for the following:
d['m0'] = d['id'].apply(lambda id: pd.read_json(f"http://localhost:3000/{id}")['H']['M0'])
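The per-row lookup above can also be pulled into a small helper, which makes it easier to percent-encode each id and to tolerate failed requests. This is only a sketch: build_url and fetch_m0 are hypothetical names, the ['H']['M0'] layout is taken from the question, and returning None on failure is an assumption about what you want in the column.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "http://localhost:3000"  # endpoint from the question


def build_url(record_id, base_url=BASE_URL):
    """Build the URL for ONE id, percent-encoding it so that spaces or
    newlines can never end up in the request line."""
    return f"{base_url}/{quote(str(record_id), safe='')}"


def fetch_m0(record_id, opener=urlopen):
    """Fetch the JSON for one id and return response['H']['M0'].

    Returns None if the request fails or the expected keys are missing
    (an assumption; raise instead if you prefer to fail loudly).
    `opener` is injectable so the function can be tried without a server.
    """
    try:
        with opener(build_url(record_id)) as resp:
            payload = json.load(resp)
        return payload["H"]["M0"]
    except (OSError, ValueError, KeyError, TypeError):
        return None
```

With the dataframe from the question this would be used as d['m0'] = d['id'].apply(fetch_m0), which calls the helper once per cell rather than once with the whole column.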
I have a question regarding the templating option for XML in Open Refine. Is it possible to export data from two columns in a nested XML-structure, if both columns contain multiple values, that need to be split first?
Here's an example to illustrate better what I mean. My columns look like this:
Column1 | Column2
https://d-nb.info/gnd/119119110;https://d-nb.info/gnd/118529889 | Grützner, Eduard von;Elisabeth II., Großbritannien, Königin
https://d-nb.info/gnd/1037554086;https://d-nb.info/gnd/1245873660 | Müller, Jakob;Meier, Anina
Each value separated by semicolon in Column1 has a corresponding value in Column2 in the right order and my desired output would look like this:
<rootElement>
<recordRootElement>
...
<edm:Agent rdf:about="https://d-nb.info/gnd/119119110">
<skos:prefLabel xml:lang="zxx">Grützner, Eduard von</skos:prefLabel>
</edm:Agent>
<edm:Agent rdf:about="https://d-nb.info/gnd/118529889">
<skos:prefLabel xml:lang="zxx">Elisabeth II., Großbritannien, Königin</skos:prefLabel>
</edm:Agent>
...
</recordRootElement>
<recordRootElement>
...
<edm:Agent rdf:about="https://d-nb.info/gnd/1037554086">
<skos:prefLabel xml:lang="zxx">Müller, Jakob</skos:prefLabel>
</edm:Agent>
<edm:Agent rdf:about="https://d-nb.info/gnd/1245873660">
<skos:prefLabel xml:lang="zxx">Meier, Anina</skos:prefLabel>
</edm:Agent>
...
</recordRootElement>
</rootElement>
(note: in my initial posting, the position of the root element was not indicated and it looked like this:
<edm:Agent rdf:about="https://d-nb.info/gnd/119119110">
<skos:prefLabel xml:lang="zxx">Grützner, Eduard von</skos:prefLabel>
</edm:Agent>
<edm:Agent rdf:about="https://d-nb.info/gnd/118529889">
<skos:prefLabel xml:lang="zxx">Elisabeth II., Großbritannien, Königin</skos:prefLabel>
</edm:Agent>
)
I managed to split the values separated by ";" for both columns like this
{{forEach(cells["Column1"].value.split(";"),v,"<edm:Agent rdf:about=\""+v+"\">"+"\n"+"</edm:Agent>")}}
{{forEach(cells["Column2"].value.split(";"),v,"<skos:prefLabel xml:lang=\"zxx\">"+v+"</skos:prefLabel>")}}
but I can't find out how to nest the split skos:prefLabel into the edm:Agent element. Is that even possible? If not, I would work with separate columns or another workaround, but I wanted to make sure there isn't a more direct way first.
Thank you!
Kristina
I am going to expand the answer from RolfBly using the Templating Exporter from OpenRefine.
I make the following assumptions:
There is some other column left of Column1 acting as the record-identifying column (see first screenshot).
The columns actually have proper names (URI and Name below).
The columns URI and Name are the only columns with multiple values. Otherwise we might produce empty XML elements with the following recipe.
We will use the information about records available via GREL to determine whether to write a <recordRootElement> or not.
Recipe:
Split first Name and then URI on the separator ";" via "Edit cells" => "Split multi-valued cells".
Go to "Export" => "Templating..."
In the prefix field use the value
<?xml version="1.0" encoding="utf-8"?>
<rootElement>
Please note that I skipped the namespace imports for edm, skos, rdf and xml.
In the row template field use the value:
{{if(row.index - row.record.fromRowIndex == 0, '<recordRootElement>', '')}}
<edm:Agent rdf:about="{{escape(cells['URI'].value, 'xml')}}">
<skos:prefLabel xml:lang="zxx">{{escape(cells['Name'].value, 'xml')}}</skos:prefLabel>
</edm:Agent>
{{if(row.index - row.record.fromRowIndex == row.record.rowCount - 1, '</recordRootElement>', '')}}
The row separator field should just contain a linebreak.
In the suffix field use the value:
</rootElement>
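For comparison, the pairing that the recipe achieves by splitting both columns into rows can be sketched outside OpenRefine in a few lines of Python. This is not part of the OpenRefine recipe. agents_xml is a hypothetical helper, and it assumes both cells always contain the same number of ";"-separated values.

```python
from xml.sax.saxutils import escape, quoteattr


def agents_xml(uris, names, sep=";"):
    """Pair the i-th URI with the i-th name and nest each label inside
    its edm:Agent element, wrapped in one <recordRootElement>."""
    uri_list = uris.split(sep)
    name_list = names.split(sep)
    if len(uri_list) != len(name_list):
        raise ValueError("columns must hold the same number of values")

    lines = ["<recordRootElement>"]
    for uri, name in zip(uri_list, name_list):
        # quoteattr adds the surrounding quotes and escapes the value
        lines.append(f"<edm:Agent rdf:about={quoteattr(uri)}>")
        lines.append(
            f'<skos:prefLabel xml:lang="zxx">{escape(name)}</skos:prefLabel>'
        )
        lines.append("</edm:Agent>")
    lines.append("</recordRootElement>")
    return "\n".join(lines)
```

Calling it once per row of the original table and wrapping the results in <rootElement> ... </rootElement> gives the structure from the question. Namespace declarations are skipped here as well.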
Disclaimer: If you're keen on using only OpenRefine, this won't be the answer you were hoping for. There may be ways in OR that I don't know of. That said, here's how I would do it.
Edit: The trick is to keep URL and literal side by side on one line. b2m's answer below does just that: go from right to left splitting, not from left to right. You can then skip steps 2 and 3, to get the result in the image.
1. Split each column into 2 columns by separator ;. You'll get 4 columns, 1 and 3 belong together, and 2 and 4 belong together. I'm assuming this will be the case consistently in your data.
2. Export 1 and 3 to a file, and export 2 and 4 to another file, of any convenient format, using the custom tabular exporter.
3. Concatenate those two files into one single file using an editor (I use Notepad++), or any other method you may prefer. Several ways to Rome here. Result in OR would be something like this.
You then have all sorts of options to put text strings in front, between and after your two columns.
In OR, you could use transform on column URL to build your XML using the below code
(note the \n for newline, that's probably just a line feed, you may want to use \r\n for carriage return + line feed if you're using Windows).
'<edm:Agent rdf:about="' + value + '">\n<skos:prefLabel xml:lang="zxx">' + cells.Name.value + '</skos:prefLabel>\n</edm:Agent>'
to get your XML in one column, like so
which you can then export using the custom tabular exporter again. Or instead you could use Add column based on this column in a similar manner, if you want to retain your URL column.
You could even do this in the editor without re-importing the file back into OR, but that's beyond the scope of this answer.
My current table in BigQuery has a column that uses complex types. The "family" column is actually a list ("repeated" feature) of records (with 2 fields: id & name).
When I try to get the 1st "id" value of 1 row with the following syntax:
FieldValueList c = qr.getValues().iterator().next();
c.get("family").getRepeatedValue().get(0).getRecordValue().get("id");
I get the exception:
Method threw 'java.lang.UnsupportedOperationException' exception.
Retrieving field value by name is not supported when there is no fields schema provided
This is a bit annoying because my table has a clearly defined schema. And when I do the "read" query with the same Java call, I can also see that this schema is correctly found:
qr.getSchema().getFields().get("family").getSubFields().toString();
-->
[Field{name=id, type=INTEGER, mode=NULLABLE, description=null}, Field{name=name, type=STRING, mode=NULLABLE, description=null}]
Due to this exception, the workaround that I have found is to pass the "index" of the record field instead of giving it its name
c.get("family").getRepeatedValue().get(0).getRecordValue().get(0).getLongValue();
However, it seems awkward to pass an index instead of a name.
Is there a better way to get the value of a field in a record inside an array? (If my column is only a record, without an array, I don't get the exception.)
Is this exception normal?
You can wrap the unnamed FieldValueList with a named one using the "of" static method:
FieldList subSchema = qr.getSchema().getFields().get("family").getSubFields();
FieldValueList c = qr.getValues().iterator().next();
FieldValueList.of(
c.get("family").getRepeatedValue().get(0).getRecordValue(),
subSchema).get("id");
The "of" method takes a FieldValueList (returned by getRecordValue() in this case) and a FieldList (subSchema here), and returns the same FieldValueList but with named access.
I have two files.
File A has 5 columns
Sno,name,age,key,checkvalue
File B has 3 columns
Sno,title,age
I want to merge these two into final file C which has
Sno,name,age,key,checkvalue
I tried renaming "title" to "name" and then I used "Add constants" to add the other two fields.
But when I try to merge these, I get the error below:
"The name of field number 3 is not the same as in the first row received: you're mixing rows with different layout. Field [age String] does not have the same name as field [age String]."
How can I solve this issue?
After getting the input from file B, use a Select values step to remove the title column. Then use an Add constants step to add the new columns name, key and checkvalue, and set "Set empty string?" to Y. Finally, do the join accordingly. It won't fail then, since both files have the same number of columns. Hope this helps.
Actually, there was an issue with the fields ... field mismatch. I used "Rename" option and it got fixed.
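The underlying rule is that both streams must have the same field names in the same order before they are merged. Outside of Kettle, that alignment can be sketched in plain Python. This only illustrates the idea and is not a Kettle transformation. align_rows and merge_csv are hypothetical names, the sample rows in the usage note are made up, and the sketch follows the rename approach from the question (title becomes name; key and checkvalue are filled with empty strings).

```python
import csv
import io

# Target layout of file C, taken from the question
TARGET = ["Sno", "name", "age", "key", "checkvalue"]


def align_rows(rows, rename=None, fill=""):
    """Rename fields, add missing ones as `fill`, and force TARGET order."""
    rename = rename or {}
    for row in rows:
        renamed = {rename.get(key, key): value for key, value in row.items()}
        yield {col: renamed.get(col, fill) for col in TARGET}


def merge_csv(text_a, text_b):
    """Append the rows of file B to those of file A in one shared layout."""
    rows_a = csv.DictReader(io.StringIO(text_a))
    rows_b = csv.DictReader(io.StringIO(text_b))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=TARGET, lineterminator="\n")
    writer.writeheader()
    writer.writerows(align_rows(rows_a))
    writer.writerows(align_rows(rows_b, rename={"title": "name"}))
    return out.getvalue()
```

For example, merge_csv("Sno,name,age,key,checkvalue\n1,Ann,30,k1,ok\n", "Sno,title,age\n2,Bob,41\n") yields a header line followed by "1,Ann,30,k1,ok" and "2,Bob,41,,".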
I have a field in the XML file, categ_id. I need to access the value of that field in my Python code, in the product_template class. I tried vals as a parameter but it did not work.
Could you give me an example of object.field_name as it relates to the case I have described?
Nebojsa - your question is not understandable at all, but I'll try to answer it. You can get the value of categ_id in two or even three ways:
vals.get('categ_id') - this is the way to go when you are creating a new record or updating an existing one with a change in the categ_id field; otherwise you'll get an error or None.
template = self.pool.get('product.template').browse(cr, uid, ids) and then template.categ_id.id - to get the value when you have the id of the record, so you can ask the database for the stored value, or for the value in the transaction if there were any changes.
The third option is the dirtiest one, because it is just cr.execute("SELECT categ_id FROM product_template WHERE id = %s", (ids[0],)) and then category_id = cr.fetchall() - it is not always a good option, as it only asks for records already existing in the database (not counting those in the transaction).