How to add column with the value of another dimension? - ssas

I apologize if the title does not make sense. I am trying to do something that is probably simple, but I have not been able to figure it out, and I'm not sure how to search for the answer. I have the following MDX query:
SELECT
event_count ON 0,
TOPCOUNT(name.children, 10, event_count) ON 1
FROM
events
which returns something like this:
| | event_count |
+---------------+-------------+
| P Davis | 123 |
| J Davis | 123 |
| A Brown | 120 |
| K Thompson | 119 |
| R White | 119 |
| M Wilson | 118 |
| D Harris | 118 |
| R Thompson | 116 |
| Z Williams | 115 |
| X Smith | 114 |
I need to include an additional column (gender). Gender is not a metric. It's just another dimension on the data. For instance, consider this query:
SELECT
gender.children ON 0,
TOPCOUNT(name.children, 10, event_count) ON 1
FROM
events
But this is not what I want! :(
| | female | male | unknown |
+--------------+--------+------+---------+
| P Davis | | | 123 |
| J Davis | | 123 | |
| A Brown | | 120 | |
| K Thompson | | 119 | |
| R White | 119 | | |
| M Wilson | | | 118 |
| D Harris | | | 118 |
| R Thompson | | | 116 |
| Z Williams | | | 115 |
| X Smith | | | 114 |
Nice try, but I just want three columns: name, event_count, and gender. How hard can it be?
Obviously this reflects lack of understanding about MDX on my part. Any pointers to quality introductory material would be appreciated.

It's important to understand that in MDX you are building sets of members on each axis, and not specifying column names like a tabular rowset. You are describing a 2-dimensional grid of results, not a linear rowset. If you imagine each dimension as a table, the member set is the set of unique values from a single column in that table.
When you choose a Measure as the member (as in your first example), it looks as if you're selecting from a table, so it's easy to misunderstand. When you choose a Dimension, you get many members, and the result is a cross-join between the rows and columns (which is sparse in this case because each name has exactly one gender).
So, you could crossjoin these two dimensions on a single axis, and then filter out the null cells:
SELECT
event_count ON 0,
TOPCOUNT(
NonEmptyCrossJoin(name.children, gender.children),
10,
event_count) ON 1
FROM
events
This should give you a result with a single column (event_count) and 10 rows, where each row is the tuple (name, gender).
I hope that sets you on the right path, and please feel free to ask if you want me to clarify anything.
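To make the "sets on an axis" idea a bit more concrete, here is a rough Python model of what the query above asks for. It is not MDX and says nothing about how the engine evaluates the query; the `facts` dictionary and both helper functions are made up for illustration, using a few rows from the question's output.

```python
from itertools import product

# Hypothetical fact data: (name, gender) -> event_count, copied from the sample output.
facts = {
    ("P Davis", "unknown"): 123,
    ("J Davis", "male"): 123,
    ("A Brown", "male"): 120,
    ("R White", "female"): 119,
}

names = sorted({name for name, _ in facts})
genders = sorted({gender for _, gender in facts})


def non_empty_crossjoin(set_a, set_b):
    """Pair every member of set_a with every member of set_b, keeping only tuples that have data."""
    return [pair for pair in product(set_a, set_b) if pair in facts]


def top_count(tuples, n):
    """Return the n tuples with the largest event_count."""
    return sorted(tuples, key=lambda pair: facts[pair], reverse=True)[:n]


rows = top_count(non_empty_crossjoin(names, genders), 3)
for name, gender in rows:
    print(name, gender, facts[(name, gender)])
```

The full cross-join here has 12 (name, gender) tuples, but only 4 of them carry data, which is the sparseness described above.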
For general introductory material, I think the book "MDX Solutions" is a good place to start:
http://www.amazon.ca/MDX-Solutions-Microsoft-Analysis-Services/dp/0471748080/

For online introductory MDX material, have a look at this gentle introduction, which presents the main MDX concepts.

Related

Using Snowflake SQL how do you find two records and then change another one based on those records to a predefined record using a local variable?

Using SQL how do you use two records to find a place, hold onto that place and use that record to replace 'Nonsense' value with that held onto place? I am going to show what I have been able to write so far, but then write out what I am still trying to figure out:
SELECT * FROM "TABLES". "ACCTS_OF_SUPERHEROS".;
DECLARE #count_rows INT = 0;
DECLARE #row_total INT = 0;
DECLARE #refAcctNum INT = 0;
DECLARE #selectedPlaceName TINYTEXT;
SET #row_total = SELECT COUNT (*)
WHILE countRows < row_total
for each acct_num store value in refAcctNum.
Using refAcctNum find place: "Gotham City", "Central City", "Metropolis", "Smallville", "Star City", "Fawcett City" store that in selectedPlaceName.
If refAccountNumber has Nonsense then replace with selectedPlaceName record
otherwise add + 1 to countRows and repeat.
END
Current table data; "ACCTS_OF_SUPERHEROS" table:
| row | acct_num | exact_address | place
| --- | -------- |------------------|--------
| 1 | 049403 | 344 Clinton Str | Metropolis
| 2 | 049403 | 344 Clinton Str | Nonsense
| 3 | 049206 | 1007 Mountain Dr | Gotham City
| 4 | 049206 | 1007 Mountain Dr | Gotham City
| 5 | 049206 | 1096 Show Dr. | Fawcett City
| 6 | 049206 | 1096 Show Dr. | Nonsense
| 7 | 049206 | NULL | Nonsense
| 8 | 049291 | 1938 Sullivan Pl | Smallville
| 9 | 049293 | 700 Hamilton Str | Central City
| 10 | 049396 | 800 Nonsense Way | Nonsense
| 11 | 049396 | NULL | Nonsense
Desired output:
| row | acct_num | exact_address | place
| --- | -------- |------------------|--------
| 1 | 049403 | 344 Clinton Str | Metropolis
| 2 | 049403 | 344 Clinton Str | Metropolis
| 3 | 049206 | 1007 Mountain Dr | Gotham City
| 4 | 049206 | 1007 Mountain Dr | Gotham City
| 5 | 049206 | 1096 Show Dr. | Fawcett City
| 6 | 049206 | 1096 Show Dr. | Fawcett City
| 7 | 049206 | NULL | Fawcett City
| 8 | 049291 | 1938 Sullivan Pl | Smallville
| 9 | 049293 | 700 Hamilton Str | Central City
| 10 | 049396 | 800 Tidal Way | Star City
| 11 | 049396 | NULL | Star City
You can use window functions:
select t.*,
max(case when place <> 'Nonsense' then place end) over (partition by acct_num) as imputed_place
from t;
This returns NULL if all the rows are 'Nonsense' for a given acct_num. You can use COALESCE() to replace the value with something else.
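Here is a small sketch of the same idea that can be run locally. It uses Python's built-in sqlite3 module rather than Snowflake, so it assumes an SQLite build with window function support (3.25 or later); the table `t`, the `row_id` column, and the 'Unknown' fallback are placeholders I picked for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (row_id INTEGER, acct_num TEXT, place TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1, "049403", "Metropolis"),
        (2, "049403", "Nonsense"),
        (8, "049291", "Smallville"),
        (10, "049396", "Nonsense"),
        (11, "049396", "Nonsense"),
    ],
)

query = """
SELECT row_id,
       acct_num,
       COALESCE(
           MAX(CASE WHEN place <> 'Nonsense' THEN place END)
               OVER (PARTITION BY acct_num),
           'Unknown'  -- hypothetical fallback when every row is 'Nonsense'
       ) AS imputed_place
FROM t
ORDER BY row_id
"""
imputed = conn.execute(query).fetchall()
for row in imputed:
    print(row)
```

Keep in mind that the partition-wide MAX yields one value per account, so an account with several distinct good places (049206 in the question) would get the alphabetically last one on every row.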
I was reading through the available list of window functions in Snowflake and think you're going to need a new window function for this. Perhaps someone can find a more built-in way, but anyway here's a user defined table function REPLACE_WITH_LKG implemented as a window function that will replace a bad value with the last known good value. As long as I was going to write it, I thought it may as well be general purpose, so it matches "bad" values using a regular expression and JavaScript RegExp options.
create or replace function REPLACE_WITH_LKG("VALUE" string, "REGEXP" string, "REGEXP_OPTIONS" string)
returns table(LKG_VALUE string)
language javascript
strict immutable
as
$$
{
    initialize: function (argumentInfo, context) {
        // Last known good value; stays empty until the first non-matching row.
        this.lkg = "";
    },
    processRow: function (row, rowWriter, context) {
        const rx = new RegExp(row.REGEXP, row.REGEXP_OPTIONS);
        // Remember the value only when it does not match the "bad" pattern.
        if (!rx.test(row.VALUE)) {
            this.lkg = row.VALUE;
        }
        rowWriter.writeRow({LKG_VALUE: this.lkg});
    },
    finalize: function (rowWriter, context) {},
}
$$;
select S.*, LKG.LKG_VALUE as PLACE
from superhero S, table(REPLACE_WITH_LKG(PLACE, 'Nonsense', 'ig')
    over(partition by null order by "ROW")) LKG;
A note on performance: the way the sample data is laid out, there is no partition other than the entire table. That's because the one obvious partition key, the account, won't work: row 10 gets its value from what would be a different window if partitioned by account, so the window has to span the entire table. This will not parallelize well and should be avoided for very large tables.
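For anyone who wants to see the "last known good" logic outside Snowflake, here is a plain Python sketch of what the UDF does row by row. The function name and the `places` list (the question's place column in row order) are just for illustration.

```python
import re


def replace_with_lkg(values, pattern, flags=re.IGNORECASE):
    """Replace each value matching `pattern` with the most recent value that did not match."""
    bad = re.compile(pattern, flags)
    last_good = ""  # same starting value as the UDF's initialize()
    result = []
    for value in values:
        if not bad.search(value):
            last_good = value
        result.append(last_good)
    return result


places = ["Metropolis", "Nonsense", "Gotham City", "Gotham City", "Fawcett City",
          "Nonsense", "Nonsense", "Smallville", "Central City", "Nonsense", "Nonsense"]
filled = replace_with_lkg(places, "Nonsense")
print(filled)
```

With the rows in this order, rows 10 and 11 pick up "Central City", since that is the last good value seen before them.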

compare two columns in PostgreSQL show only highest value

This is my table
I'm trying to find which urban area has the highest girls-to-boys ratio.
Thank you for helping me in advance.
| urban | allgirls | allboys |
| :---- | :------: | :-----: |
| Ran | 100 | 120 |
| Ran | 110 | 105 |
| dhanr | 80 | 73 |
| dhanr | 140 | 80 |
| mohan | 180 | 73 |
| mohan | 25 | 26 |
This is the query I used, but I did not get the expected results
SELECT urban, Max(allboys) as high_girls,Max(allgirls) as high_boys
from table_urban group by urban
Expected results
| urban | allgirls | allboys |
| :---- | :------: | :-----: |
| dhanr | 220 | 153 |
First of all, your expected result doesn't seem correct: the girls-to-boys ratio is highest in "mohan" (205/99, about 2.07), not in "dhanr" (220/153, about 1.44), assuming what you are really looking for is the highest ratio and not the highest number of girls.
You need to group by urban area first, sum each column, then divide one sum by the other and take the top row.
-- cast to numeric so the division is not truncated to an integer
SELECT foo.urban AS urban, foo.girls::numeric / foo.boys AS ratio
FROM (
SELECT urban, SUM(allboys) AS boys, SUM(allgirls) AS girls
FROM table_urban
GROUP BY urban
) AS foo
ORDER BY ratio DESC
LIMIT 1
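Or, without the subquery:
SELECT urban, SUM(allboys) AS boys, SUM(allgirls) AS girls
FROM table_urban
GROUP BY urban
-- repeat the aggregates: PostgreSQL does not accept output aliases inside an ORDER BY expression
ORDER BY SUM(allgirls)::numeric / SUM(allboys) DESC
LIMIT 1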
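If you want to try the group-then-divide approach without a PostgreSQL server, here is a sketch using Python's built-in sqlite3 module with the question's data. SQLite, like PostgreSQL, truncates integer division, so the sketch casts one side to REAL; that cast is SQLite syntax and is my substitution, not part of the queries above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_urban (urban TEXT, allgirls INTEGER, allboys INTEGER)")
conn.executemany(
    "INSERT INTO table_urban VALUES (?, ?, ?)",
    [
        ("Ran", 100, 120),
        ("Ran", 110, 105),
        ("dhanr", 80, 73),
        ("dhanr", 140, 80),
        ("mohan", 180, 73),
        ("mohan", 25, 26),
    ],
)

rows = conn.execute(
    """
    SELECT urban,
           SUM(allgirls) AS girls,
           SUM(allboys) AS boys,
           CAST(SUM(allgirls) AS REAL) / SUM(allboys) AS ratio
    FROM table_urban
    GROUP BY urban
    ORDER BY ratio DESC
    """
).fetchall()

best = rows[0]  # the area with the highest girls-to-boys ratio
print(best)
```

Dropping the LIMIT, as done here, makes it easy to eyeball the whole ranking before keeping only the first row.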

How do I make this calculated measure axis independent and portable?

So I am a beginner at MDX and I have an MDX query that works the way I want it to so long as I put the set on either the columns or rows. If I put the same set on the filter axis it doesn't work. I'd like to make this calculated measure independent of where this set lives. I'm guaranteed to always have some form of a set included, but I'm not guaranteed which axis the user will place it on (e.g. rows, columns, filter).
Here is the query that works:
WITH MEMBER Measures.avgApplicants as
Avg([applicationDate].[yearMonth].[month].Members, [Measures].[applicants])
SELECT
{[Measures].[applicants],[Measures].[avgApplicants]} ON 0,
{[applicationDate].[yearMonth].[year].[2015]:[applicationDate].[yearMonth].[year].[2016]} ON 1
FROM [applicants]
And results:
| | applicants | avgMonthlyApplicants |
+------+------------+----------------------+
| 2015 | 367 | 33 |
| 2016 | 160 | 33 |
However, if I shift this query around to move the set onto the filter axis I get nothing:
WITH MEMBER Measures.avgApplicants as
Avg([applicationDate].[yearMonth].[month].Members, [Measures].[applicants])
SELECT
{[Measures].[applicants],[Measures].[avgApplicants]} ON 0,
{[Gender].Members} ON 1
FROM [applicants]
WHERE ([applicationDate].[yearMonth].[year].[2015]:[applicationDate].[yearMonth].[year].[2016])
I get this:
| | | applicants | avgApplicants |
+-------------+-------------+------------+---------------+
| All Genders | | 478 | |
| | Female | 172 | |
| | Male | 183 | |
| | Not Known | 61 | |
| | Unspecified | 62 | |
So how do I create this calculated measure so that it isn't dependent on which axis the set is placed on?

Increment value when the field is the same

First, I'm sorry for the ambiguous title.
Here's my problem :
I'm using Access and I have this table :
+--------+-----------+
| PARENT | CHILD |
+--------+-----------+
| JOHN | TANIA |
| JOHN | ROBERT |
| JOHN | APRIL |
| HELEN | TOM |
| HELEN | GABRIELLE |
+--------+-----------+
And I would like to add a column like this with queries or VBA code :
+--------+-----------+---------+
| PARENT | CHILD | LIST |
+--------+-----------+---------+
| JOHN | TANIA | CHILD 1 |
| JOHN | ROBERT | CHILD 2 |
| JOHN | APRIL | CHILD 3 |
| HELEN | TOM | CHILD 1 |
| HELEN | GABRIELLE | CHILD 2 |
+--------+-----------+---------+
I want to do this because at the end, I want to run a cross tab query. I'm only missing that last column to create that query.
I tried to do it in a recordset, but my database starts bloating after a couple of rst.Update (I have 700k+ rows)
I created a temporary table and used UPDATE queries but it just takes too much time.
I think there might be a SQL code that would do what I need, but I just can't figure it out. I hope you could help me, thanks :)
You can do something like the below, but it would be much better with some sort of IDs:
SELECT Parent.PARENT,
Parent.CHILD,
(SELECT Count(*)
FROM Parent p
WHERE p.Parent=Parent.Parent
AND p.Child<=Parent.Child) AS ChildNo
FROM Parent
ORDER BY Parent.PARENT, Parent.CHILD;
Parent is the name of the table.
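Here is a quick way to try the correlated-subquery idea on the sample data. It is a sketch using Python's built-in sqlite3 module, not Access, so the `||` concatenation and the `a`/`b` aliases are SQLite-flavoured assumptions; in Access you would concatenate with `&` instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Parent (PARENT TEXT, CHILD TEXT)")
conn.executemany(
    "INSERT INTO Parent VALUES (?, ?)",
    [
        ("JOHN", "TANIA"),
        ("JOHN", "ROBERT"),
        ("JOHN", "APRIL"),
        ("HELEN", "TOM"),
        ("HELEN", "GABRIELLE"),
    ],
)

# For each row, count the siblings (same PARENT) whose name sorts at or before this CHILD.
numbered = conn.execute(
    """
    SELECT a.PARENT,
           a.CHILD,
           'CHILD ' || (SELECT COUNT(*)
                        FROM Parent AS b
                        WHERE b.PARENT = a.PARENT
                          AND b.CHILD <= a.CHILD) AS LIST
    FROM Parent AS a
    ORDER BY a.PARENT, a.CHILD
    """
).fetchall()
for row in numbered:
    print(row)
```

Note that the numbering follows the alphabetical order of CHILD, so APRIL becomes CHILD 1 for JOHN rather than TANIA. That is why an ID or insertion-order column would be better if the original order matters.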

Query to compare values across different tables?

I have a pair of models in my Rails app that I'm having trouble bridging.
These are the tables I'm working with:
states
+----+--------+------------+
| id | fips | name |
+----+--------+------------+
| 1 | 06 | California |
| 2 | 36 | New York |
| 3 | 48 | Texas |
| 4 | 12 | Florida |
| 5 | 17 | Illinois |
| … | … | … |
+----+--------+------------+
places
+----+--------+
| id | place |
+----+--------+
| 1 | Fl |
| 2 | Calif. |
| 3 | Texas |
| … | … |
+----+--------+
Not all places are represented in the states model, but I'm trying to perform a query where I can compare a place's place value against all state names, find the closest match, and return the corresponding fips.
So if my input is Calif., I want my output to be 06
I'm still very new to writing SQL queries, so if there's a way to do this using Ruby within my Rails (4.1.5) app, that would be ideal.
My other plan of attack was to add a fips column to the "places" table, and write something that would run the above comparison and then populate fips so my app doesn't have to run this query every time the page loads. But I'm very much a beginner, so that sounds... ambitious.
This is not an easy query in SQL. Your best bet is one of the fuzzy string matching routines, which are documented here (in PostgreSQL they are provided by the fuzzystrmatch extension).
For instance, soundex() or levenshtein() may be sufficient for what you want. Here is an example:
select distinct on (p.place) p.place, s.name, s.fips, levenshtein(p.place, s.name) as dist
from places p cross join
states s
order by p.place, dist asc;
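To get a feel for how edit distance behaves on this data before wiring it into the database, here is a small self-contained Python sketch. The `levenshtein` function is a textbook dynamic-programming implementation written for this example (it is not the database function), and `states` and `closest_fips` are hypothetical names built from the sample rows.

```python
def levenshtein(a, b):
    """Classic edit distance: insertions, deletions and substitutions each cost 1."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


# Sample rows from the states table: name -> fips.
states = {"California": "06", "New York": "36", "Texas": "48",
          "Florida": "12", "Illinois": "17"}


def closest_fips(place):
    """Return the fips code of the state whose name has the smallest edit distance to `place`."""
    best_name = min(states, key=lambda name: levenshtein(place, name))
    return states[best_name]


print(closest_fips("Calif."))
```

One caveat worth knowing: very short abbreviations can tie. `Fl` is five edits away from both `Texas` and `Florida`, so plain edit distance cannot tell them apart, and you may want a prefix check or a manual mapping for those cases.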