SQL View Summarizing One Table, Columns Based on Unknown Categories Entered - sql

I have a table in the form:
date / category (string) / count (integer)
--------------------------------------------
7/15 A 3
7/15 B 7
7/15 C 2
7/16 A 9
7/16 B 1
7/16 C 2
Basically, for each day, each category will have a count associated with it.
The problem is, I don't necessarily know what these categories will end up being. Say I know they are A, B, and C, but next week, there is a D, E, and F.
And this is the view that I want to build:
Date / A / B / C / .. (however many categories found)
---------------------------------------------------------
7/15 3 5 2 3 4
7/16 9 5 9 6 4
...
..
.
I usually know enough SQL to get by, but this one is racking my brain. I don't think I am using the right vocabulary when trying to google it, because I'm not finding the answers I am looking for.

The answer is simple: you cannot build a view that does what you would like, because a view has its columns pre-defined.
You could do one of the following:
1. Create a stored procedure that re-creates the view every week. The procedure would analyze the data, determine the columns, and then use dynamic SQL to alter the view.
2. Change the definition of what you want and put the values in a single column, separated by commas (or some other character).
3. Predefine a list of acceptable columns, create the view (using pivot, say), and then periodically go through and modify it when new values arise.
4. Do the pivoting at the application layer. This is particularly easy in Excel.
One big caveat with (1) and (3): if anything uses the view as "select * from view", you need to be sure that those queries, stored procedures, user-defined functions, etc. are recompiled. Otherwise, they will see the wrong list of columns (this may only apply to SQL Server).
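As a rough illustration of option (1), here is a minimal sketch in Python with SQLite standing in for the real database. The table name daily_counts, the view name daily_pivot, and the column names day, category and cnt are all assumptions made for the example. The helper reads the categories currently present and re-creates the view with one conditional-aggregation column per category, so it would have to be re-run whenever a new category appears.

```python
import sqlite3

def rebuild_pivot_view(conn, table="daily_counts", view="daily_pivot"):
    """Recreate the view with one column per category currently in the table.

    Returns the list of categories found. Assumes at least one category exists.
    """
    cats = [r[0] for r in conn.execute(
        f"SELECT DISTINCT category FROM {table} ORDER BY category")]
    # One conditional-aggregation column per category; quotes are doubled
    # because a view definition cannot take bound parameters.
    cols = ", ".join(
        "SUM(CASE WHEN category = '{}' THEN cnt ELSE 0 END) AS \"{}\"".format(
            c.replace("'", "''"), c.replace('"', '""'))
        for c in cats)
    conn.execute(f"DROP VIEW IF EXISTS {view}")
    conn.execute(
        f"CREATE VIEW {view} AS SELECT day, {cols} FROM {table} GROUP BY day")
    return cats

# Sample data taken from the question (hypothetical table and column names).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_counts (day TEXT, category TEXT, cnt INTEGER)")
conn.executemany("INSERT INTO daily_counts VALUES (?, ?, ?)", [
    ("7/15", "A", 3), ("7/15", "B", 7), ("7/15", "C", 2),
    ("7/16", "A", 9), ("7/16", "B", 1), ("7/16", "C", 2),
])

cats = rebuild_pivot_view(conn)
rows = conn.execute("SELECT * FROM daily_pivot ORDER BY day").fetchall()

# A new category shows up later (made-up row): the view must be rebuilt to see it.
conn.execute("INSERT INTO daily_counts VALUES ('7/16', 'D', 5)")
cats2 = rebuild_pivot_view(conn)
rows2 = conn.execute("SELECT * FROM daily_pivot ORDER BY day").fetchall()

print(["day"] + cats2)
for row in rows2:
    print(row)
```

The same generated SELECT could be run directly instead of being wrapped in a view, which is closer to option (4), pivoting at the application layer.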

Related

Opensearch SQL find common elements in two columns

I am having some issues with the OpenSearch SQL engine: my queries do not work on OpenSearch.
For example, I have two columns (A and B):
Column #1  Column #2
A          1
A          2
A          3
A          4
B          1
B          2
B          3
C          3
I need to get the values in column #2 that are common to every element of column #1.
In this example, the output is 3.
One more thing: I want to be able to do this for only some of the items in column #1. If I do it only for A and B, the output is 1, 2, 3.
Can you help me find a query (compatible with OpenSearch) that gets this result?
Or any other solution other than SQL.
Thank you,
Vincent
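For the "any other solution" route, the requirement amounts to a set intersection, which could be done client-side once the (column #1, column #2) pairs have been fetched. The following is only a sketch under that assumption: the function name common_values is hypothetical, and how the rows are retrieved from OpenSearch is left out.

```python
def common_values(rows, keys=None):
    """Return the column #2 values shared by every column #1 key.

    rows: iterable of (key, value) pairs.
    keys: optional set of keys to restrict to; keys absent from the data are ignored.
    """
    groups = {}
    for key, value in rows:
        if keys is None or key in keys:
            groups.setdefault(key, set()).add(value)
    if not groups:
        return []
    # Intersect the value sets of all selected keys.
    return sorted(set.intersection(*groups.values()))

# Data from the question.
rows = [("A", 1), ("A", 2), ("A", 3), ("A", 4),
        ("B", 1), ("B", 2), ("B", 3),
        ("C", 3)]

all_common = common_values(rows)             # every key: A, B and C
ab_common = common_values(rows, {"A", "B"})  # only A and B
print(all_common, ab_common)
```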

sql use different columns for same query (directed graph as undirected )

Suppose I have a table of relationships like in a directed graph. For some pairs of ids there are both 1->2 and 2->1 relations, for others there are not. Some nodes are only present in one column.
a b
1 2
2 1
1 3
4 1
5 2
Now I want to work with it as an undirected graph: for example, grouping and filtering using both columns. For instance, filter out node 5 and count the neighbors of the rest:
node neighbor_count
1 3
2 1
3 1
4 1
Is it possible to compose queries in such a way that first column a is used and then column b is used in the same manner?
I know it is achievable by doubling the table:
select a, count(distinct b)
from (
    select a, b from grap
    union all
    select b as a, a as b from grap
) g  -- some engines require an alias on a derived table
where a not in (5, 6, 7) and b not in (5, 6, 7)
group by a;
However, the real tables are quite large (10^9 to 10^10 pairs). Would the union require additional disk usage? A single scan through the database is already quite slow for me. Are there better ways to do this?
(Currently database is sqlite, but the less platform specific the answer the better)
The union all is generated only for the duration of the query. Does it use more disk space? Not permanently.
If the processing of the query requires saving the data out to disk, then it will use more temporary storage for intermediate results.
I would suggest, though, that if you want an undirected graph with this representation, you add the reversed pairs that are not already in the table. This will use more disk space, but you won't have to play games with queries.
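A minimal sketch of that suggestion, using Python's sqlite3 module and the sample rows from the question. It inserts each reversed pair that is missing and then runs a plain single-table query; the NOT EXISTS form is one possible way to find the missing pairs, not necessarily the fastest at 10^9 rows, where an index on (a, b) would presumably be needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE grap (a INTEGER, b INTEGER)")
conn.executemany("INSERT INTO grap VALUES (?, ?)",
                 [(1, 2), (2, 1), (1, 3), (4, 1), (5, 2)])

# Add every reversed pair that is not already stored,
# so each edge ends up present in both directions.
conn.execute("""
    INSERT INTO grap (a, b)
    SELECT g.b, g.a
    FROM grap AS g
    WHERE NOT EXISTS (
        SELECT 1 FROM grap AS r WHERE r.a = g.b AND r.b = g.a
    )
""")

total_pairs = conn.execute("SELECT COUNT(*) FROM grap").fetchone()[0]

# No UNION ALL needed any more: filter out node 5 and count neighbors.
neighbor_counts = conn.execute("""
    SELECT a, COUNT(DISTINCT b)
    FROM grap
    WHERE a <> 5 AND b <> 5
    GROUP BY a
    ORDER BY a
""").fetchall()

for node, count in neighbor_counts:
    print(node, count)
```

On the sample data this reproduces the node/neighbor_count table from the question.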

How can I reduce complexity? Data preparation, SQL + Tableau

I need to prepare some data to connect to Tableau, and I'm struggling because the data is too large for Tableau to handle, so I'm looking for ideas to code this efficiently in SQL.
Setup:
I have 2 million users
There are 30 different categories, and each user can fall into many. For example:
User 1 - Category A, B and C
User 2 - Category F
User 3 - Category A, B
What I want:
Select three categories and assign priority 1, priority 2 and priority 3
This selection is not static, so today I may choose A, B, C, but tomorrow those categories could be D, G, A
So if I have:
Priority 1: A
Priority 2: B
Priority 3: C
I want the number of users who fall into category A
I want the number of users who fall into category B AND are not in category A
I want the number of users who fall into category C AND are not in category A or B
My original idea was to create a table with one row per user and one yes/no column per category, and then aggregate, but the final table is still too large for Tableau to handle.
Any ideas?
Update: My idea is to prepare a table with aggregated numbers and a few thousand rows at most, so that it can be processed with Tableau
You can assign each of the 30 categories a unique position from 1 to 30. Each user is then assigned a 30-digit binary number based on the categories they fall into. This binary number can be converted to a decimal number, the greatest of which is 2^30-1 (1,073,741,823), a 10-digit number that can be stored as an ordinary integer without exponent notation.
Whenever you need to see which categories a user falls into, apply the reverse conversion: decimal to binary, and then to a string padded with zeros on the left. In this string, a 1 at a given position means the user is in the corresponding category.
I think you can try this methodology.
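To make the encoding concrete, here is a small Python sketch. Only seven category names are listed for brevity (the same scheme extends to 30), users 4 and 5 are made up to exercise the priority logic, and the function names are hypothetical. Users are first collapsed to one row per distinct bitmask with a count, and the priority counts are then computed from that much smaller table.

```python
from collections import Counter

CATEGORIES = list("ABCDEFG")  # shortened list; the scheme extends to 30 categories
BIT = {name: 1 << i for i, name in enumerate(CATEGORIES)}

def encode(categories):
    """Pack a user's categories into a single integer, one bit per category."""
    mask = 0
    for c in categories:
        mask |= BIT[c]
    return mask

def decode(mask):
    """Recover the category names from an integer mask."""
    return [c for c in CATEGORIES if mask & BIT[c]]

def priority_counts(mask_counts, priorities):
    """Count users by the highest-priority category they belong to."""
    result = {p: 0 for p in priorities}
    for mask, n in mask_counts.items():
        for p in priorities:
            if mask & BIT[p]:
                result[p] += n
                break  # the user is counted once, under the first matching priority
    return result

# Users 1 to 3 come from the question; users 4 and 5 are invented for the example.
users = {1: ["A", "B", "C"], 2: ["F"], 3: ["A", "B"], 4: ["B", "C"], 5: ["C"]}

# Aggregated table: one entry per distinct mask, with the number of users.
mask_counts = Counter(encode(cats) for cats in users.values())

abc = priority_counts(mask_counts, ["A", "B", "C"])
dga = priority_counts(mask_counts, ["D", "G", "A"])
print(abc, dga)
```

The size of the aggregated table depends on how many distinct category combinations actually occur, so whether it stays within a few thousand rows would need to be checked against the real data.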

How to combine a row of cells in VBA if certain column values are the same

I have a database where all of the input from the user (through a userform) gets stored. In the database, each column is a different category for the type of data (ex. date, shift, quantity, etc) and the data from the userform input gets put into its corresponding category. For some of the data, all the data is the same except for the quantity. I was wondering how I could combine these rows into one and add the quantities to each other for the whole database (ex. combining the first and third data entries). I have tried playing around with a couple different loops but can't seem to figure anything out.
Period Date Line Shift Type Quantity
4 x 2 4/3/18 A 3 14 18
4 x 2 4/3/18 A 3 13 12
4 x 2 4/3/18 A 3 14 15
Thank you!
If you're looking to modify the underlying database, you might be able to query the data into the format you want by including all the other columns in a GROUP BY clause and summing Quantity, saving the result to another table, and then replacing the original table with the properly formatted one.
If you have the data in Excel and you just want to view it with the duplicate rows summed, a Pivot Table would be a good choice. You can select all the other columns as rows for the Pivot Table and sum of Quantity as the values.
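The GROUP BY approach could look roughly like the sketch below. It uses Python's sqlite3 module as a stand-in for whatever database holds the userform data, and the table name entries and the column names are assumptions based on the headers in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE entries (
    period TEXT, entry_date TEXT, line TEXT,
    shift INTEGER, type INTEGER, quantity INTEGER)""")
conn.executemany("INSERT INTO entries VALUES (?, ?, ?, ?, ?, ?)", [
    ("4 x 2", "4/3/18", "A", 3, 14, 18),
    ("4 x 2", "4/3/18", "A", 3, 13, 12),
    ("4 x 2", "4/3/18", "A", 3, 14, 15),
])

# Group on every column except quantity and sum the quantities
# into a new table.
conn.execute("""
    CREATE TABLE entries_merged AS
    SELECT period, entry_date, line, shift, type, SUM(quantity) AS quantity
    FROM entries
    GROUP BY period, entry_date, line, shift, type
""")

# Replace the original table with the merged one.
conn.execute("DROP TABLE entries")
conn.execute("ALTER TABLE entries_merged RENAME TO entries")

merged = conn.execute("SELECT * FROM entries ORDER BY type").fetchall()
for row in merged:
    print(row)
```

The first and third entries from the question collapse into one row with a quantity of 33, while the second entry is left as it was.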

SQL add column value based on another column ACCESS

What I'm trying to do is add another column to an existing table whose value will depend on an already existing column in the table. For example say I have this table:
Table1
|Letter|
A
C
R
A
I want to create another column (for example, numbers) that is chosen based on the letters. So let's say A corresponds with 10, C with 3 and R with 32 (this was chosen at random). My resulting table should be like this:
|Letter| Number |
A | 10
C | 3
R | 32
A | 10
Can anyone help me write a query that does this? I have over 20 different cases, so the simpler it looks the better.
Thanks in advance!
Options:
Build a table that associates [Letter] with the numeric value. Include this table in the query by joining on the common [Letter] field.
Use a very long Switch() expression. However, a query design grid cell has a limit of 1024 characters.
It would be better to provide an example with your real data and criteria.
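A minimal sketch of the lookup-table option, run through Python's sqlite3 module rather than Access, so the exact syntax may need adjusting there. The lookup table name LetterValues is an assumption; the letters and numbers are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (Letter TEXT)")
conn.executemany("INSERT INTO Table1 VALUES (?)",
                 [("A",), ("C",), ("R",), ("A",)])

# Hypothetical lookup table: one row per letter, however many cases there are.
conn.execute("CREATE TABLE LetterValues (Letter TEXT PRIMARY KEY, Number INTEGER)")
conn.executemany("INSERT INTO LetterValues VALUES (?, ?)",
                 [("A", 10), ("C", 3), ("R", 32)])

# LEFT JOIN keeps letters that have no mapping yet (their Number is NULL).
result = conn.execute("""
    SELECT t.Letter, v.Number
    FROM Table1 AS t
    LEFT JOIN LetterValues AS v ON v.Letter = t.Letter
    ORDER BY t.rowid
""").fetchall()

for letter, number in result:
    print(letter, number)
```

Adding a 21st case then means inserting one more row into the lookup table, with no change to the query.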