How to sum data by a category in Spotfire

What would the custom expression be to sum data by a category, for each site?
Using the data below, I would like to sum [X] only for rows with category Blue, for each site.
What I have so far is Sum([X]) OVER [Site]. Where and how do I put in the category qualifier?

The Intersect() function is a perfect fit here. It creates a hierarchy based on however many columns you list. More info is in the documentation.
Anyway, try the following:
Sum([X]) OVER (Intersect([Site], [Category]))
To do the same for only a single category, you can use an expression like
Sum(If([Category]="Blue",[X],0)) OVER ([Site])
Rows whose [Category] is not "Blue" contribute 0 to the sum (the comparison is case sensitive, so beware!).
If you need to match several categories, you can replace the condition with
If([Category] in ("Blue", "Nurple", "Taupe"), ...)
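To make the semantics of the conditional sum concrete, here is a minimal Python sketch of what Sum(If(...)) OVER ([Site]) computes. The sample rows and the function name are made up for illustration; this is not Spotfire code, only a model of the calculation.

```python
from collections import defaultdict

# Hypothetical sample rows: (site, category, x)
rows = [
    ("A", "Blue", 1.0),
    ("A", "Red", 2.0),
    ("A", "Blue", 3.0),
    ("B", "Blue", 5.0),
    ("B", "blue", 7.0),  # lowercase: not counted, the match is case sensitive
]

def conditional_sum_per_site(rows, wanted=("Blue",)):
    """Sum x per site, counting only rows whose category is in `wanted`."""
    totals = defaultdict(float)
    for site, category, x in rows:
        # Mirrors If([Category] in (...), [X], 0): other rows add 0.
        totals[site] += x if category in wanted else 0.0
    return dict(totals)

per_site = conditional_sum_per_site(rows)
# Like OVER ([Site]), every row receives the total of its own site.
per_row = [per_site[site] for site, _, _ in rows]
```

Passing wanted=("Blue", "Nurple", "Taupe") corresponds to the multi-value variant of the condition.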

What I found works best is: Sum(If([Category]="Blue",[X],0)) OVER ([Site])

Open Refine: Exporting nested XML with templating

I have a question regarding the templating option for XML in Open Refine. Is it possible to export data from two columns in a nested XML structure if both columns contain multiple values that need to be split first?
Here's an example to illustrate better what I mean. My columns look like this:
Column1 | Column2
https://d-nb.info/gnd/119119110;https://d-nb.info/gnd/118529889 | Grützner, Eduard von;Elisabeth II., Großbritannien, Königin
https://d-nb.info/gnd/1037554086;https://d-nb.info/gnd/1245873660 | Müller, Jakob;Meier, Anina
Each value separated by semicolon in Column1 has a corresponding value in Column2 in the right order and my desired output would look like this:
<rootElement>
<recordRootElement>
...
<edm:Agent rdf:about="https://d-nb.info/gnd/119119110">
<skos:prefLabel xml:lang="zxx">Grützner, Eduard von</skos:prefLabel>
</edm:Agent>
<edm:Agent rdf:about="https://d-nb.info/gnd/118529889">
<skos:prefLabel xml:lang="zxx">Elisabeth II., Großbritannien, Königin</skos:prefLabel>
</edm:Agent>
...
</recordRootElement>
<recordRootElement>
...
<edm:Agent rdf:about="https://d-nb.info/gnd/1037554086">
<skos:prefLabel xml:lang="zxx">Müller, Jakob</skos:prefLabel>
</edm:Agent>
<edm:Agent rdf:about="https://d-nb.info/gnd/1245873660">
<skos:prefLabel xml:lang="zxx">Meier, Anina</skos:prefLabel>
</edm:Agent>
...
</recordRootElement>
</rootElement>
(note: in my initial posting, the position of the root element was not indicated and it looked like this:
<edm:Agent rdf:about="https://d-nb.info/gnd/119119110">
<skos:prefLabel xml:lang="zxx">Grützner, Eduard von</skos:prefLabel>
</edm:Agent>
<edm:Agent rdf:about="https://d-nb.info/gnd/118529889">
<skos:prefLabel xml:lang="zxx">Elisabeth II., Großbritannien, Königin</skos:prefLabel>
</edm:Agent>
)
I managed to split the values separated by ";" for both columns like this
{{forEach(cells["Column1"].value.split(";"),v,"<edm:Agent rdf:about=\""+v+"\">"+"\n"+"</edm:Agent>")}}
{{forEach(cells["Column2"].value.split(";"),v,"<skos:prefLabel xml:lang=\"zxx\">"+v+"</skos:prefLabel>")}}
but I can't find out how to nest the split skos:prefLabel into the edm:Agent element. Is that even possible? If not, I would work with separate columns or another workaround, but I wanted to check first whether there's a more direct way.
Thank you!
Kristina
I am going to expand the answer from RolfBly using the Templating Exporter from OpenRefine.
I do have the following assumptions:
There is some other column left of Column1 acting as record identifying column (see first screenshot).
The columns actually have some proper names
The columns URI and Name are the only columns with multiple values. Otherwise we might produce empty XML elements with the following recipe.
We will use the information about records available via GREL to determine whether to write a <recordRootElement> or not.
Recipe:
Split first Name and then URI on the separator ";" via "Edit cells" => "Split multi-valued cells".
Go to "Export" => "Templating..."
In the prefix field use the value
<?xml version="1.0" encoding="utf-8"?>
<rootElement>
Please note that I skipped the namespace imports for edm, skos, rdf and xml.
In the row template field use the value:
{{if(row.index - row.record.fromRowIndex == 0, '<recordRootElement>', '')}}
<edm:Agent rdf:about="{{escape(cells['URI'].value, 'xml')}}">
<skos:prefLabel xml:lang="zxx">{{escape(cells['Name'].value, 'xml')}}</skos:prefLabel>
</edm:Agent>
{{if(row.index - row.record.fromRowIndex == row.record.rowCount - 1, '</recordRootElement>', '')}}
The row separator field should just contain a linebreak.
In the suffix field use the value:
</rootElement>
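For readers who want to check the expected nesting outside OpenRefine, the pairing that the recipe achieves can be sketched in a few lines of Python. This is only an illustration, assuming both cells hold the same number of ";"-separated values; the function name agents_xml is made up, and namespace declarations are skipped as in the recipe.

```python
from xml.sax.saxutils import escape, quoteattr

def agents_xml(uris, names, sep=";"):
    """Pair the i-th URI with the i-th name and nest them in one record."""
    uri_list = uris.split(sep)
    name_list = names.split(sep)
    if len(uri_list) != len(name_list):
        raise ValueError("URI and name counts differ")
    parts = ["<recordRootElement>"]
    for uri, name in zip(uri_list, name_list):
        # quoteattr adds the surrounding quotes and escapes the value.
        parts.append(f"<edm:Agent rdf:about={quoteattr(uri)}>")
        parts.append(
            f'<skos:prefLabel xml:lang="zxx">{escape(name)}</skos:prefLabel>'
        )
        parts.append("</edm:Agent>")
    parts.append("</recordRootElement>")
    return "\n".join(parts)

record = agents_xml(
    "https://d-nb.info/gnd/119119110;https://d-nb.info/gnd/118529889",
    "Grützner, Eduard von;Elisabeth II., Großbritannien, Königin",
)
```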
Disclaimer: If you're keen on using only OpenRefine, this won't be the answer you were hoping for. There may be ways in OR that I don't know of. That said, here's how I would do it.
Edit: The trick is to keep the URL and the literal side by side on one line. b2m's answer below does just that: split from right to left, not from left to right. You can then skip steps 2 and 3 to get the result in the image.
1. Split each column into 2 columns by the separator ;. You'll get 4 columns; 1 and 3 belong together, and 2 and 4 belong together. I'm assuming this will be the case consistently in your data.
2. Export 1 and 3 to a file, and export 2 and 4 to another file, of any convenient format, using the custom tabular exporter.
3. Concatenate those two files into one single file using an editor (I use Notepad++), or any other method you may prefer. Several ways to Rome here. Result in OR would be something like this.
You then have all sorts of options to put text strings in front, between and after your two columns.
In OR, you could use transform on column URL to build your XML using the below code
(note the \n for newline, that's probably just a line feed, you may want to use \r\n for carriage return + line feed if you're using Windows).
'<edm:Agent rdf:about="' + value + '">\n<skos:prefLabel xml:lang="zxx">' + cells.Name.value + '</skos:prefLabel>\n</edm:Agent>'
to get your XML in one column, like so
which you can then export using the custom tabular exporter again. Or instead you could use Add column based on this column in a similar manner, if you want to retain your URL column.
You could even do this in the editor without re-importing the file back into OR, but that's beyond the scope of this answer.

Querying full and sub-strings via multi-valued parameter using SQL

I am building a report with Microsoft SSRS (2012) that has a multi-value parameter @parCode for the user to filter for certain codes. This works perfectly fine. Generally, my query looks like this:
SELECT ...
FROM ...
WHERE
TblCode.Code IN (@parCode)
ORDER BY...
The codes are of following type (just an excerpt):
C73.0
C73.1
...
C79.0
C79.1
C79.2
Now, in addition to filtering for several of these codes, I would like to also be able to filter for sub-strings of the codes. That is, when the user enters (Example 1)
C79
for @parCode, the output should be
C79.0
C79.1
C79.2
So eventually the user should be able to enter (Example 2)
C73.0
C79
for @parCode and the output would be
C73.0
C79.0
C79.1
C79.2
I managed to implement both functionalities separately, so either filtering for multiple "complete" codes or filtering for a sub-string of a code, but not both simultaneously.
I tried to do something like
...
WHERE
TblCode.Code IN (@parCode +'%')
ORDER BY...
but this breaks Example 2. On the other hand, if I try to work with LIKE or = instead of the IN statement, then I won't be able to make the parameter multi-valued.
Does anyone have an idea how to realize such functionality, or does the IN statement paired with multi-valued parameters simply not allow for it?
Thank you very much!
Assuming you are using SQL Server:
WHERE (
TblCode.Code IN (@parCode)
OR
CASE
WHEN CHARINDEX('.', TblCode.Code)>0 THEN LEFT(TblCode.Code, CHARINDEX('.', TblCode.Code)-1)
ELSE TblCode.Code
END IN (@parCode)
)
The first clause is an exact match, so for your example it matches C73.0.
The second clause matches on the characters before the dot, so it would return C79.0, C79.1, C79.2, etc.
Warning: Filtering on an expression would prevent the use of an index on TblCode.Code.
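The matching rule in the two clauses can be checked quickly outside the database. The following Python sketch is only a model of the WHERE logic, with a made-up function name and the sample codes from the question; it is not something SSRS runs.

```python
def code_matches(code, selected):
    """True if the full code, or its part before the first dot, was selected."""
    # Same as the CASE expression: text before the dot, else the whole code.
    prefix = code.split(".", 1)[0]
    return code in selected or prefix in selected

codes = ["C73.0", "C73.1", "C79.0", "C79.1", "C79.2"]
selected = {"C73.0", "C79"}  # Example 2 from the question
result = [c for c in codes if code_matches(c, selected)]
```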

Solr subquery merging issue

I have an issue searching with Solr in the following scenario.
I'd like to get all products within my favorite tags, categories and users. I want all products created by my favorite user without any filter, but products from a favorite tag or category must be filtered to within a selected location. I have tried the following:
http://www.example.com:8983/solr/collection1/select?rows=10&start=0&wt=json&indent=true&sort=event_start_date asc&q=status:1 AND event_start_date:[2015-04-23T21:38:00Z TO *] AND ( tags:5539d77455061a650f96c67e OR category1_id:53d16fb28066a12606bbb5f2 OR category2_id:53d16fb28066a12606bbb5f2&fq={!geofilt d=40.2335}&pt=9.9312328,76.26730409999999&sfield=latlng) OR ( user_id:5465da1dc54d3c2a15000000 )
But it's not working. Can anybody help me find what's wrong with my query?
First of all, you have an fq (filter query) clause inside your query clause (check the parentheses), which is wrong:
fq={!geofilt d=40.2335}&pt=9.9312328,76.26730409999999&sfield=latlng
You can try things like putting the geofilt filter query OUTSIDE your main query, with tests so that it will be skipped if...
http://www.example.com:8983/solr/collection1/select?rows=10&start=0&wt=json&indent=true&sort=event_start_date asc&q=status:1 AND
event_start_date:[2015-04-23T21:38:00Z TO *] AND
(tags:5539d77455061a650f96c67e OR
category1_id:53d16fb28066a12606bbb5f2 OR
category2_id:53d16fb28066a12606bbb5f2) OR
(user_id:5465da1dc54d3c2a15000000)
&fq=user_id:5465da1dc54d3c2a15000000 OR
{!geofilt pt=9.9312328,76.26730409999999 sfield=latlng d=40.2335}
If user_id is 5465da1dc54d3c2a15000000, then the filter query is already true, so the location part is skipped.
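One way to avoid the misplaced parenthesis is to build the request from separate parameters instead of one hand-written string. The sketch below uses Python's standard urllib only to show that q and fq are sibling parameters; the q and fq values are copied from the answer above, and whether Solr accepts that exact fq expression has not been verified here.

```python
from urllib.parse import urlencode, parse_qs

base = "http://www.example.com:8983/solr/collection1/select"

# Main query: no filter query inside it.
q = (
    "status:1 AND "
    "event_start_date:[2015-04-23T21:38:00Z TO *] AND "
    "(tags:5539d77455061a650f96c67e OR "
    "category1_id:53d16fb28066a12606bbb5f2 OR "
    "category2_id:53d16fb28066a12606bbb5f2) OR "
    "(user_id:5465da1dc54d3c2a15000000)"
)

# Filter query as its own top-level parameter (value taken from the answer).
fq = (
    "user_id:5465da1dc54d3c2a15000000 OR "
    "{!geofilt pt=9.9312328,76.26730409999999 sfield=latlng d=40.2335}"
)

params = {
    "rows": 10,
    "start": 0,
    "wt": "json",
    "indent": "true",
    "sort": "event_start_date asc",
    "q": q,
    "fq": fq,
}
url = base + "?" + urlencode(params)

# Decoding the query string again shows q and fq as separate entries.
decoded = parse_qs(url.split("?", 1)[1])
```

Letting urlencode do the escaping also takes care of the spaces, brackets and braces in both values.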

How to change values in facet to same in Google Refine?

I'm trying to clean this data: https://dl.dropbox.com/u/820037/local_council_election_data_w_occupation.gz
It's all the candidates for a local council election in Finland. The column "Ammatti" contains the occupation of each candidate as reported by him/her.
I want to find all the students, but the problem is that they can be "opiskelija" (student) or "yliopisto-opiskelija" (university student) and things like that.
I clicked the column title "Ammatti" and filtered it with "opiskelija", then I created a text facet from the menu in the column title.
That gives me the following facet:
agrol. opiskelija AMK 1
agrologiopiskelija 9
agronomiopiskelija 1
...and so on.
I want to change the value of "Ammatti" (occupation) to "opiskelija" (student) in every one of these cases.
To make things a bit more complicated, the facet also has some occupations (mature students and administrative staff) that I don't want to change to "opiskelija":
aikuisopiskelija 10
opiskelijakunnan hallituksen varapuheenjohtaja 1
opiskelijapalvelun päällikkö 1
opiskelijapalvelupäällikkö 1
I did this by hand clicking through the whole list in the facet and changing the occupations one by one.
I suppose there is a better way to do this, but could someone please tell me how I should've done it?
Using the 'include' option in the facet, select all the rows that you want to transform in the column "Ammatti". Then, for this column, invoke the Transform function and replace value with "opiskelija".
This will replace all the values you have selected with "opiskelija".
Hope this helps (and that it doesn't come too late).
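The selection logic (everything containing "opiskelija" except a few excluded occupations) can also be written down as a small rule. Here is a hedged Python sketch of that rule using the values quoted in the question; the names EXCLUDE and normalise_occupation are made up, and this is not how OpenRefine itself evaluates a facet.

```python
# Occupations from the question that must stay unchanged.
EXCLUDE = {
    "aikuisopiskelija",
    "opiskelijakunnan hallituksen varapuheenjohtaja",
    "opiskelijapalvelun päällikkö",
    "opiskelijapalvelupäällikkö",
}

def normalise_occupation(value):
    """Collapse student-like occupations to 'opiskelija', keep the rest."""
    if "opiskelija" in value and value not in EXCLUDE:
        return "opiskelija"
    return value

sample = ["agrol. opiskelija AMK", "agrologiopiskelija", "aikuisopiskelija"]
cleaned = [normalise_occupation(v) for v in sample]
```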

MOSS 2007: What is the source of "Directories"?

I'm trying to generate a new SharePoint list item directly using SQL Server. What's stopping me is the damn tp_DirName column. I have no idea how to create this value.
For instance, I have selected all tasks from AllUserData, and these are possible values for the column: 'MySite/Lists/Task', 'Lists/Task' and even 'MySite/Lists/List2'.
MySite is the FullUrl value from the Webs table. I can obtain it. But what about 'Lists/Task' and '/Lists/List2'? Where are they stored?
To put it without the SQL context: what is the object that has an attribute such as '/Lists/List2'? Where can I set it up in the GUI?
Just an FYI: writing directly to SharePoint's SQL tables is very much unsupported. You should really try to write something that uses the SharePoint Object Model. Writing to the SharePoint database directly means Microsoft will not support the environment.
I've discovered that the [AllDocs] table, in contrast to its name, contains information about "directories" that can be used to generate tp_DirName. At least, I've found "List2" and "Task" entries in the [AllDocs].[tp_Leaf] column.
So the solution looks like this: concatenate the following 2 components to get tp_DirName:
[Webs].[FullUrl] for the web containing the list that contains the item.
[AllDocs].[tp_Leaf] for the list containing the item.
Concatenate the following 2 components to get tp_Leaf for an item:
(Item count in the list) + 1
'_.000'
Regards,
Well, my previous answer was not very useful, though it had a key to the magic. Now I have a really useful one.
Whatever they said, M$ is very liberal to the MOSS DB hackers. At least they provide the following documents:
http://msdn.microsoft.com/en-us/library/dd304112(PROT.13).aspx
http://msdn.microsoft.com/en-us/library/dd358577(v=PROT.13).aspx
Read? Then, you know that all folders are listed in the [AllDocs] table with '1' in the 'Type' column.
Now, let's look at the 'tp_RootFolder' column in AllLists. It looks like a folder id, doesn't it? So, just SELECT the single row from [AllDocs] where Id = tp_RootFolder and Type = 1. Then concatenate DirName + LeafName, and you will know what the 'tp_DirName' value for a newly generated item in the list should be. That looks like a rock-solid solution.
Now about tp_LeafName for the new items. Before, I wrote that the answer is (Item count in the list) + 1 + '_.000', which corresponds to the following query:
DECLARE @itemscount int;
SELECT @itemscount = COUNT(*) FROM [dbo].[AllUserData] WHERE [tp_ListId] = '...my list id...';
INSERT INTO [AllUserData] (tp_LeafName, ...) VALUES(CAST(@itemscount + 1 AS NVARCHAR(255)) + '_.000', ...)
That said, I'm not sure that it always works. For items, yes, but for docs... I'll look into the question. Leave a comment if you want to read a report.
Hehe, there is a stored procedure named proc_AddListItem. I was almost right. MS people do the same, but instead of (count + 1) they use just... tp_ID :)
Anyway, now I know THE SINGLE RIGHT answer: I have to call proc_AddListItem.
UPDATE: Don't forget to present the data from the [AllUserData] table as a new item in [AllDocs] (just insert id and leafname, see how SP does it itself).
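For reference, the two string values discussed in this answer can be composed as follows. This is only a sketch of the concatenation, assuming the root folder's DirName and LeafName have already been read from [AllDocs]; the function names are made up, and it does not touch the database (the answer's actual recommendation is to call proc_AddListItem).

```python
def list_dir_name(root_dir_name, root_leaf_name):
    """tp_DirName for items: the list root folder's DirName + LeafName."""
    # Skip an empty DirName so the result has no leading slash.
    return "/".join(p for p in (root_dir_name, root_leaf_name) if p)

def item_leaf_name(tp_id):
    """tp_LeafName for a list item, built from tp_ID as described above."""
    return f"{tp_id}_.000"

dir_name = list_dir_name("MySite/Lists", "Task")
leaf_name = item_leaf_name(7)  # 7 is an arbitrary example id
```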