How to convert a table's format or structure with Google Refine (OpenRefine)

I have a table with the following format:
ID Estation Y M D H N Nh h Cl
1 78357 2017 5 1 1 0 0 -9001 0
2 78357 2017 5 1 2 0 0 -9001 0
3 78357 2017 5 1 3 1 1 750 5
I want to convert the data in this table to the following format:
ID Estation Y M D H Var Value
1 78357 2017 5 1 1 N 0
2 78357 2017 5 1 2 N 0
3 78357 2017 5 1 3 N 1
4 78357 2017 5 1 1 Nh 0
5 78357 2017 5 1 2 Nh 0
6 78357 2017 5 1 3 Nh 1
7 78357 2017 5 1 1 h -9001
8 78357 2017 5 1 2 h -9001
9 78357 2017 5 1 3 h 750
10 78357 2017 5 1 1 Cl 0
11 78357 2017 5 1 2 Cl 0
12 78357 2017 5 1 3 Cl 5
Because of the number of records I have to convert from one format to the other, I want to do it using Google Refine. Does anyone have an idea how to do it?

You can do this in Google Refine (now called OpenRefine) using the Transpose option.
In the 'N' column click the drop down menu and choose "Transpose -> Transpose cells across columns into rows"
In the screen shown choose "N" as the "From Column" and "(last column)" as the "To Column"
Choose to Transpose into Two New Columns. Call the Key column "Var" and the Value column "Value"
Check the box that says "Fill down in other columns"
Click Transpose
This should give you the various variables and values in a single column with multiple rows.
Sorting into the order you give in your example may be challenging. If you sort the Var column in reverse alphabetical order it is close, although not quite. I am not sure how important this is to you.
Remember that in OpenRefine you have to choose Reorder Rows Permanently to commit the new sort order.
You may have to transform the ID column to renumber it with unique IDs. You can do this with the GREL expression rowIndex+1 once you have the sort order correct.
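For readers who want to check the expected result outside OpenRefine, the transpose described above can be sketched in plain Python. This is only an illustration under stated assumptions: the function name transpose_to_rows and its parameters are hypothetical, and the sample rows are the three rows from the question.

```python
# Sample rows copied from the question; column names follow the question's header.
HEADER = ["ID", "Estation", "Y", "M", "D", "H", "N", "Nh", "h", "Cl"]
ROWS = [
    [1, 78357, 2017, 5, 1, 1, 0, 0, -9001, 0],
    [2, 78357, 2017, 5, 1, 2, 0, 0, -9001, 0],
    [3, 78357, 2017, 5, 1, 3, 1, 1, 750, 5],
]


def transpose_to_rows(header, rows, first_value_col="N"):
    """Turn every column from first_value_col onward into (Var, Value) rows.

    Hypothetical helper, not part of OpenRefine.
    """
    start = header.index(first_value_col)
    key_cols = header[1:start]  # skip the old ID; it is regenerated below
    out = []
    # Looping over the variables first gives the order shown in the question:
    # all N rows, then all Nh rows, and so on.
    for col in range(start, len(header)):
        for row in rows:
            out.append(row[1:start] + [header[col], row[col]])
    # Renumber the rows, like rowIndex+1 after the final sort.
    numbered = [[i + 1] + r for i, r in enumerate(out)]
    return ["ID"] + key_cols + ["Var", "Value"], numbered


new_header, long_rows = transpose_to_rows(HEADER, ROWS)
print(*new_header)
for r in long_rows:
    print(*r)
```

Iterating over the variable columns in the outer loop produces the grouping by Var directly, so no separate sort step is needed in this sketch.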

Related

How to count unique combinations of Co-ordinates to find most customers in grid section

I have a customer table with their closest delivery hub on a grid based system and need to calculate what is the most populated area using a query.
This is the current query I have that lists all of the Co-ordinates per Customer.
SELECT Customers.HubID,
       TO_CHAR(Hubs.HubCoordX, 'FM999999999999') AS "X Co-ordinate",
       TO_CHAR(Hubs.HubCoordY, 'FM999999999999') AS "Y Co-ordinate"
FROM Customers
INNER JOIN Hubs ON Customers.HubID = Hubs.DestinationID
ORDER BY Hubs.HubCoordX, Hubs.HubCoordY
This query creates the following result.
HubID  X Co-ord  Y Co-ord
9      -3        1
11     -2        18
2      0         0
3      0         0
3      0         0
1      0         0
1      0         0
3      0         0
4      3         1
5      3         1
7      7         3
But I need a result like this
X Co-ordinate  Y Co-ordinate  Population
-3             1              1
-2             18             1
0              0              6
3              1              2
7              3              1
Thanks in advance
I have attempted to use Count Unique; however, it resulted in only counting individual Co-ordinates once.
SELECT TO_CHAR(Hubs.HubCoordX, 'FM999999999999') AS "X Co-ordinate",
       TO_CHAR(Hubs.HubCoordY, 'FM999999999999') AS "Y Co-ordinate",
       COUNT(HubID) AS "Population"
FROM Customers
INNER JOIN Hubs ON Customers.HubID = Hubs.DestinationID
GROUP BY Hubs.HubCoordX, Hubs.HubCoordY
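As a cross-check of the desired Population figures, here is a minimal Python sketch that counts customers per (X, Y) pair, which is the same idea as grouping by both co-ordinates and counting rows. The variable names are hypothetical; the rows are copied from the first result table in the question.

```python
from collections import Counter

# (HubID, X, Y) rows copied from the question's first result table.
customer_hubs = [
    (9, -3, 1), (11, -2, 18), (2, 0, 0), (3, 0, 0), (3, 0, 0), (1, 0, 0),
    (1, 0, 0), (3, 0, 0), (4, 3, 1), (5, 3, 1), (7, 7, 3),
]

# Count customers per (X, Y) combination, one count per grid cell.
population = Counter((x, y) for _hub_id, x, y in customer_hubs)

for (x, y), count in sorted(population.items()):
    print(x, y, count)

# The most populated grid cell and its customer count.
busiest_cell, busiest_count = population.most_common(1)[0]
print("most populated:", busiest_cell, busiest_count)
```

The key point is that the pair of co-ordinates is counted as a unit, rather than counting distinct X or distinct Y values separately.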

How to merge two rows if they have the same values in SQL Server

I have the following output:
Sno  Value Stream  Duration  Inspection
1    Test1         3         1
2    ON            14        0
3    Start         5         0
4    Test1         5         1
5    OFF           0         1
6    Start         0         1
7    Test2         0         1
8    ON            3         1
9    START         0         1
10   Test2         2         2
I want to merge rows that have the same Value Stream, adding the values of the row after START to the row before ON. For example, Sno 4 will merge into Sno 1:
1 | Test1 | 8 | 2 |
If the combination does not match, the rows must not be merged. Only ON/Start pairs count; if the pair is OFF/Start, do not merge. For example, Sno 5 and 6 are OFF/Start, so Sno 4 and 7 must not be merged.
I think you are talking about summarization, not merging:
select [Value Stream],
       min(Sno) as First_Sno,
       sum(Duration) as total_Duration,
       sum(Inspection) as Inspection
from yourtable
group by [Value Stream]
This will give you the result.
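To see what this summarization returns for the sample data, here is a rough Python equivalent of the GROUP BY query. The function name summarize is hypothetical, and the sketch assumes that Start and START are treated as the same value, as they would be under a case-insensitive collation; that is an assumption, not something the question states.

```python
# (Sno, Value Stream, Duration, Inspection) rows copied from the question.
rows = [
    (1, "Test1", 3, 1), (2, "ON", 14, 0), (3, "Start", 5, 0), (4, "Test1", 5, 1),
    (5, "OFF", 0, 1), (6, "Start", 0, 1), (7, "Test2", 0, 1), (8, "ON", 3, 1),
    (9, "START", 0, 1), (10, "Test2", 2, 2),
]


def summarize(rows):
    """Group by Value Stream; keep the first Sno and sum the other columns."""
    groups = {}
    for sno, stream, duration, inspection in rows:
        key = stream.casefold()  # assumption: case-insensitive grouping
        if key not in groups:
            groups[key] = [stream, sno, 0, 0]
        group = groups[key]
        group[1] = min(group[1], sno)
        group[2] += duration
        group[3] += inspection
    return [tuple(group) for group in groups.values()]


summary = summarize(rows)
for stream, first_sno, total_duration, total_inspection in summary:
    print(stream, first_sno, total_duration, total_inspection)
```

Like the query, this groups every row with the same Value Stream and does not apply the ON/Start rule from the question; it only shows that the Test1 rows add up to the 1 | Test1 | 8 | 2 example.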

if statement in excel, adding 1 if cell with text but

I am creating an Excel sheet that has three columns: detail, month, and month count.
1 -- I would like the formula to look at the detail column and, if there is text, put the previous cell's number plus 1 in the month count; if not, insert 0.
2 -- I would like the formula to continue from the last number before a cell with 0, so that the cell with 0 neither affects the other cells nor resets the count back to 1, which is the problem I am having.
3 -- I also need the formula to reset every month, from whatever number it was back to 0 or 1, depending on whether the first cell of the new month has text. For this, the formula needs to look at the month column.
This is what I have so far:
=IF(ISTEXT(G95), I94+ 1, 0)
The formula for the count column should be as follows.
=IF(A2<>"",COUNTIF($B$1:B2,B2)-COUNTIFS($A$1:A2,"",$B$1:B2,B2),0)
Breakdown of how this works:
A2<>"" checks whether the detail column is populated.
COUNTIF($B$1:B2,B2) counts how many entries, up to and including this row, reference the same month.
COUNTIFS($A$1:A2,"",$B$1:B2,B2) counts how many of those rows have a blank detail cell. Subtracting this from the previous count gives the number that are not blank.
The IF returns 0 if the detail is empty.
This returned the following data (- marks an empty Det cell):
Orderly            Random
Det Mon Count      Det Mon Count
X   1   1          -   2   0
X   1   2          X   1   1
X   1   3          X   1   2
-   1   0          -   2   0
X   1   4          X   2   1
X   2   1          X   1   3
X   2   2          X   1   4
-   2   0          -   1   0
-   2   0          -   1   0
-   2   0          -   2   0
-   3   0          -   3   0
X   3   1          X   3   1
-   3   0          -   1   0
X   3   2          -   3   0
X   3   3          X   1   5
-   3   0          X   2   2
X   3   4          X   3   2
-   3   0          -   3   0
X   3   5          -   3   0
X   3   6          -   2   0
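The counting logic of the COUNTIF/COUNTIFS formula can also be sketched in Python to confirm the Random columns above. The function name month_counts is hypothetical, and an empty string stands in for an empty detail cell.

```python
def month_counts(details, months):
    """For each row, count the non-blank details so far within the same month."""
    counts = []
    for i, (detail, month) in enumerate(zip(details, months)):
        if detail == "":
            counts.append(0)  # IF(A2<>"", ..., 0)
            continue
        # Details of all rows up to and including this one with the same month.
        same_month = [
            d for d, m in zip(details[: i + 1], months[: i + 1]) if m == month
        ]
        total = len(same_month)                         # COUNTIF($B$1:B2,B2)
        blanks = sum(1 for d in same_month if d == "")  # COUNTIFS($A$1:A2,"",$B$1:B2,B2)
        counts.append(total - blanks)
    return counts


# The "Random" Det and Mon columns from the table ("" is an empty detail cell).
details = ["", "X", "X", "", "X", "X", "X", "", "", "", "",
           "X", "", "", "X", "X", "X", "", "", ""]
months = [2, 1, 1, 2, 2, 1, 1, 1, 1, 2, 3, 3, 1, 3, 1, 2, 3, 3, 3, 2]

random_counts = month_counts(details, months)
print(random_counts)
```

Because each month keeps its own running count, the order of the months does not matter, which is why the formula also works on the unsorted Random data.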
It sounds like you want to keep a running total for the month count in the column and put a 0 if there is no text. If that is the case, you can put this formula in I95.
=IF(ISTEXT(G95),MAX($I$2:I94)+1, 0)

MDX: iif condition on the value of dimension

I have one virtual cube that consists of two cubes.
Example of the fact table of the first cube:
id object_id time_id date_id state
1 10 2 1 0
2 11 5 1 0
3 10 7 1 1
4 10 3 1 0
5 11 4 1 0
6 11 7 1 1
7 10 8 1 0
8 11 5 1 0
9 10 7 1 1
10 10 9 1 2
Where State: 0 - Ok, 1 - Down, 2 - Unknown
For this cube I have one measure, StateCount, which should count the states for each object_id.
For example, here we get this result:
for 10 : 3 times Ok , 2 times Down, 1 time Unknown
for 11 : 3 times Ok , 1 time Down
Second cube looks like this:
id object_id time_id date_id status
1 10 2 1 0
2 11 5 1 0
3 10 7 1 1
4 10 3 1 1
5 11 4 1 1
Where Status: 0 - out, 1 - in. I keep this in StatusDim.
In this table I keep records that should not be counted. If an object has status 1, I have to exclude it from the count.
If we intersect these tables and use StateCount, we should get this result:
for 10 : 2 times Ok , 1 time Down, 1 time Unknown
for 11 : 2 times Ok , 1 time Down
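The expected counts can be reproduced with a small Python sketch, which may help when checking an MDX result. It assumes the two fact tables are matched on the id column; the question does not state the join key, but matching on id gives exactly the figures listed above. All variable names are hypothetical.

```python
from collections import Counter

STATE_NAMES = {0: "Ok", 1: "Down", 2: "Unknown"}

# (id, object_id, state) from the first fact table in the question.
state_facts = [
    (1, 10, 0), (2, 11, 0), (3, 10, 1), (4, 10, 0), (5, 11, 0),
    (6, 11, 1), (7, 10, 0), (8, 11, 0), (9, 10, 1), (10, 10, 2),
]
# (id, status) from the second fact table; status 1 means "exclude from count".
status_facts = [(1, 0), (2, 0), (3, 1), (4, 1), (5, 1)]

# Assumption: rows of the two tables are matched on the id column.
excluded_ids = {row_id for row_id, status in status_facts if status == 1}

# Count the remaining rows per (object_id, state name).
state_count = Counter(
    (object_id, STATE_NAMES[state])
    for row_id, object_id, state in state_facts
    if row_id not in excluded_ids
)

for (object_id, state), count in sorted(state_count.items()):
    print(object_id, state, count)
```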
As far as I know, I must use a calculated member with an IIF condition. Currently I'm trying something like this:
WITH MEMBER [Measures].[StateTimeCountDown] AS(
iif(
[StatusDimDown.DowntimeHierarchy].[DowntimeStatus].CurrentMember.MemberValue
<> "in"
, [Measures].[StateTimeCount]
, null )
)
The multidimensional way to do this would be to make attributes from your state and status columns (hopefully with user-understandable members, i.e. using "Ok" and not "0"). Then you can just use a normal count measure on the fact tables and slice by these attributes. No need for complex calculation definitions.

T-SQL: how to write a view/function which returns a table of dynamic size

I have recently written a script in T-SQL which uses dynamic SQL to generate a table. The output of the script varies, depending on when it is run. The output is something like this:
Group 2010 2011 2012 2013
A 1 2 3 2
B 4 3 3 4
C 4 3 1 1
However, each year another year is added onto the table, meaning the table size varies.
e.g.
Group 2010 2011 2012 2013 2014
A 1 2 3 2 2
B 4 3 3 4 2
C 4 3 1 1 3
I need to be able to access the data in this table via Access to generate some reports, so I require some sort of view or function to get the data.
What is the best way of doing this?
If you have to use this output in a report, then you have to use fixed column names in SQL, as below.
Group year4 year3 year2 year1
A 1 2 3 2
B 4 3 3 4
C 4 3 1 1
In the reporting tool you can then map year1 = current year, year2 = current year - 1, and so on.
Update 2
Using this method you can easily design your report.
Group year5 year4 year3 year2 year1
A 1 2 3 2 2
B 4 3 3 4 2
C 4 3 1 1 3
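The mapping from the fixed column names back to real years (year1 = current year, year2 = current year - 1, and so on) is simple arithmetic. A minimal Python sketch of it follows; the function name label_year_columns is hypothetical, and in practice this step would live in the reporting tool rather than in Python.

```python
def label_year_columns(columns, current_year):
    """Map fixed names year1, year2, ... to actual years (year1 = current year)."""
    labels = {}
    for name in columns:
        if name.startswith("year") and name[4:].isdigit():
            offset = int(name[4:]) - 1  # year1 -> 0, year2 -> 1, ...
            labels[name] = str(current_year - offset)
        else:
            labels[name] = name  # leave non-year columns such as Group unchanged
    return labels


columns = ["Group", "year5", "year4", "year3", "year2", "year1"]
labels = label_year_columns(columns, current_year=2014)
print(labels)
```

With the column names fixed, the view or query keeps a stable shape, and only the labels change from year to year.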