I have a table (see the image below, red box). A, B, C, and D are its columns. The data structure will always be like this: if col A is Type_1, only col B has content, while if col A is Type_2, cols C and D have content and col B is NULL.
Now, the table enclosed in the green box is my desired output.
My experience with building SELECT statements is not very extensive, and I'm almost leaning towards creating two separate tables to get my desired result (one table for Type_1 data only and another for Type_2 data only).
The question is: is it possible to query two rows and combine them into a single output row using a SELECT query, considering that these two rows are in the same table?
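Roughly, the layout is like this (the Id grouping column and the data values here are just placeholders):

Id | A      | B     | C     | D
1  | Type_1 | data1 | NULL  | NULL
1  | Type_2 | NULL  | data2 | data3

And the desired output (the green box) combines the two rows into one:

Id | B     | C     | D
1  | data1 | data2 | data3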
Thanks.
Something like this:
SELECT
Table2Id,
MAX(B) B,
MAX(C) C,
MAX(D) D
FROM tbl
WHERE A != 'Type_3'
GROUP BY Table2Id
Assuming that there is only one row of data for Type_1 and one row for Type_2, you can use the following:
SELECT Id, MAX(B) AS B, MAX(C) AS C, MAX(D) AS D
FROM Table2
WHERE A IN ('Type_1','Type_2')
GROUP BY Id
Example in this SQL Fiddle
You can make subqueries by enclosing them in parentheses. As in:
SELECT (SELECT TOP 1 B FROM table ORDER BY some_ordering), (SELECT TOP 1 C FROM table WHERE NOT C IS NULL), D FROM table
The queries inside the parentheses can apply to any table, and can use data from the main query both in calculating the selected values and in filters.
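For instance, a correlated version applied to the original question (a sketch, assuming the same tbl and Table2Id names used above and exactly one Type_1 row per Table2Id) could pull B from the Type_1 row and C/D from the matching Type_2 row:
SELECT t.Table2Id,
       (SELECT MAX(x.B) FROM tbl x WHERE x.Table2Id = t.Table2Id AND x.A = 'Type_1') AS B,
       (SELECT MAX(x.C) FROM tbl x WHERE x.Table2Id = t.Table2Id AND x.A = 'Type_2') AS C,
       (SELECT MAX(x.D) FROM tbl x WHERE x.Table2Id = t.Table2Id AND x.A = 'Type_2') AS D
FROM tbl t
WHERE t.A = 'Type_1'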
Related
I have a SQL table with about 50 columns: the first represents unique users, and the other columns represent categories which are scored 1-10.
Here is an idea of what I'm working with
user | a    | b    | c
abc  | 5    | null | null
xyz  | null | 6    | null
I am interested in counting the number of non-null values per column.
Currently, my queries are:
SELECT col_name, COUNT(col_name) AS count
FROM table
WHERE col_name IS NOT NULL
Is there a way to count non-null values for each column in one query, without having to manually enter each column name?
The desired output would be:
column | count
a      | 1
b      | 1
c      | 0
Consider the approach below (no knowledge of column names is required at all, with the exception of user):
select column, countif(value != 'null') non_null_count
from your_table t,
unnest(array(
  -- turn each row into (column, value) pairs: serialize the row to JSON,
  -- split it into "name":value fragments, then split each fragment on ':'
  select as struct trim(arr[offset(0)], '"') column, trim(arr[offset(1)], '"') value
  from unnest(split(trim(to_json_string(t), '{}'))) kv,
  unnest([struct(split(kv, ':') as arr)])
  where trim(arr[offset(0)], '"') != 'user'
)) rec
group by column
If applied to the sample data in your question, the output is:

column | non_null_count
a      | 1
b      | 1
c      | 0
I didn't do this in BigQuery but in SQL Server; however, BigQuery has the concept of UNPIVOT as well. Basically, you're trying to transpose your columns to rows and then do a simple aggregate to see how many records have data in each column. My example is below and should work in BigQuery with little or no tweaking.
Here is the table I created:
CREATE TABLE example(
user_name char(3),
a integer,
b integer,
c integer
);
INSERT INTO example(user_name, a, b, c)
VALUES('abc', 5, null, null);
INSERT INTO example(user_name, a, b, c)
VALUES('xyz', null, 6, null);
INSERT INTO example(user_name, a, b, c)
VALUES('tst', 3, 6, 1);
And here is the UNPIVOT I did:
select count(*) as amount, col
from
(select user_name, a, b, c from example) e
unpivot
(blah for col in (a, b, c)
) as unpvt
group by col
Here's an example of the output (note: I added an extra record to the table to make sure it was working properly):

amount | col
2      | a
2      | b
1      | c
Again, the syntax may be slightly different in BigQuery, but I think this should get you most of the way there.
Here's a link to my db-fiddle - https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=deaa0e92a4ef1de7d4801e458652816b
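For reference, BigQuery's own UNPIVOT syntax is very close; a rough sketch of the equivalent there, assuming the same example table, would be:
select col, count(*) as amount
from example
unpivot (val for col in (a, b, c))
group by col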
How could you convert or transpose a range of data into a single column? Values may be duplicated in the data, but the output should contain unique values only.
(Updated after more information was provided in comments)
If your initial data comes from a query you could use a common table expression to do this:
with query_results (a,b,c) as (
... your original query that you have not shown us goes here ...
)
select a
from query_results
union
select b
from query_results
union
select c
from query_results
order by 1
The UNION operator will remove duplicates from the output
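As a side note, UNION also folds multiple NULLs down to a single NULL row; add a WHERE ... IS NOT NULL to each branch if you don't want NULL in the result, and use UNION ALL instead if you ever need to keep duplicates.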
You can use UNPIVOT (with DISTINCT added, since the output should contain unique values only):
SELECT DISTINCT value
FROM your_table
UNPIVOT ( value FOR type IN ( a, b, c ) );
Using a SELECT, I want to find the row ID in a table with 3 columns (each value is unique/dissimilar and is populated from separate tables). Only the ID is auto-incremented.
I have a middle table I reference that has 3 values: ID, A, B.
A is based on data from another table.
B is based on data from another table.
How can I select the row ID when I only know the value of A and B, and A and B are not the same value?
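Here's a minimal sketch of what the middle table looks like (the table name, the int types, and the SQL Server-style identity column are placeholders; my real schema differs):
CREATE TABLE MiddleTable (
    ID int IDENTITY(1,1) PRIMARY KEY,  -- the only auto-incremented column
    A  int NOT NULL,                   -- populated from one table
    B  int NOT NULL                    -- populated from a different table
);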
Do you mean that columns A and B are foreign keys?
Does this work?
SELECT [ID]
FROM tbl
WHERE A = #a AND B = #b
SELECT ID FROM table WHERE A=value1 and B=value2
It's not very clear. Do you mean this:
SELECT ID
FROM middletable
WHERE A = knownA
AND B = knownB
Or this?
SELECT ID
FROM middletable
WHERE A = knownA
AND B <> A
Or perhaps "I know A" means you have a list of values for A, which come from another table?
SELECT ID
FROM middletable
WHERE A IN
( SELECT otherA FROM otherTable ...)
AND B IN
( SELECT otherB FROM anotherTable ...)
In SQL Server, I have a table where a column A stores some data. This data can contain duplicates (ie. two or more rows will have the same value for the column A).
I can easily find the duplicates by doing:
select A, count(A) as CountDuplicates
from TableName
group by A having (count(A) > 1)
Now, I want to retrieve the values of other columns, let's say B and C. Of course, those B and C values can differ even for rows sharing the same A value, but that doesn't matter to me. I just want any B value and any C value: the first, the last, or a random one.
If I had a small table and one or two columns to retrieve, I would do something like:
select A, count(A) as CountDuplicates,
    (select top 1 child.B from TableName as child where child.A = base.A) as B
from TableName as base
group by A
having (count(A) > 1)
The problem is that I have many more rows to get, and the table is quite big, so having several child selects will have a high performance cost.
So, is there a less ugly pure SQL solution to do this?
Not sure if my question is clear enough, so I'll give an example based on the AdventureWorks database. Let's say I want to list available States, and for each State, get its code, a city (any city) and an address (any address). The easiest, and most inefficient, way to do it would be:
var q = from c in data.StateProvinces select new { c.StateProvinceCode, c.Addresses.First().City, c.Addresses.First().AddressLine1 };
in LINQ-to-SQL, and it will do two selects for each of the 181 States, so 363 selects in total. In my case, I am searching for a way to have a maximum of 182 selects.
The ROW_NUMBER function in a CTE is the way to do this. For example:
CREATE TABLE #mytab (A INT, B INT, C INT)
INSERT INTO #mytab ( A, B, C ) VALUES (1, 1, 1)
INSERT INTO #mytab ( A, B, C ) VALUES (1, 1, 2)
INSERT INTO #mytab ( A, B, C ) VALUES (1, 2, 1)
INSERT INTO #mytab ( A, B, C ) VALUES (1, 3, 1)
INSERT INTO #mytab ( A, B, C ) VALUES (2, 2, 2)
INSERT INTO #mytab ( A, B, C ) VALUES (3, 3, 1)
INSERT INTO #mytab ( A, B, C ) VALUES (3, 3, 2)
INSERT INTO #mytab ( A, B, C ) VALUES (3, 3, 3)
;WITH numbered AS
(
SELECT *, rn=ROW_NUMBER() OVER (PARTITION BY A ORDER BY B, C)
FROM #mytab AS m
)
SELECT *
FROM numbered
WHERE rn=1
As I mentioned in my comment to HLGEM and Philip Kelley, their simple use of an aggregate function does not necessarily return one "solid" record for each A group; instead, it may return column values from many separate rows, all stitched together as if they were a single record. For example, if this were a PERSON table, with the PersonID being the "A" column, and distinct contact records (say, Home and Work), you might wind up returning the person's home city, but their office ZIP code -- and that's clearly asking for trouble.
The use of the ROW_NUMBER, in conjunction with a CTE here, is a little difficult to get used to at first because the syntax is awkward. But it's becoming a pretty common pattern, so it's good to get to know it.
In my sample I've defined a CTE that tacks on an extra column rn (standing for "row number"), partitioned by the A column. A SELECT on that result, filtering to only those rows having a row number of 1 (i.e., the first record found for that value of A), returns a "solid" record for each A group -- in my example above, you'd be certain to get either the Work or the Home address, but not elements of both mixed together.
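To make that concrete with the #mytab data above: an aggregate-only query like the one below returns (1, 3, 2) for A = 1, even though no row (1, 3, 2) exists in the table, whereas the ROW_NUMBER version always returns the values of one actual row.
SELECT A, MAX(B) AS B, MAX(C) AS C
FROM #mytab
GROUP BY A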
It concerns me that you want any old value for fields b and c. If they are to be meaningless why are you returning them?
If it truly doesn't matter (and I honestly can't imagine a case where I would ever want this, but it's what you said) and the values for b and c don't even have to be from the same record, GROUP BY with the use of MIN or MAX is the way to go. It's more complicated if you want the values from one particular record for all fields.
select A, count(A) as CountDuplicates, min(B) as B , min(C) as C
from TableName as base
group by A
having (count(A) > 1)
You can do something like this if you have id as a primary key in your table (note that the inner query groups by A only; grouping by the primary key as well would make every count equal to 1):
select t.id, t.b, t.c, d.CountDuplicates
from TableName t
inner join
(
    select A, count(A) as CountDuplicates
    from TableName
    group by A
    having count(A) > 1
) d on t.A = d.A
I want to select some rows from a table.
Along with the normal columns that I get back from the query result, I also need to append some additional fields to the result set.
I am exporting the table into a CSV file. The output file will have some extra fields along with the normal column values that were returned from the query.
For example, the table has columns A, B, and C, which contain:
A B C
11 Bob S
12 Gary J
13 Andy K
Now my output file should have
11, Bob , S, "DDD" , 1
12, Gary, J, "DDD" , 2
13, Andy, K , "DDD" , 3
(One way was to first do select A,B,C from test_table and then manipulate the file by appending the 2 fields manually)
Is there any way that I can get the extra 2 values in the query itself as hardcoded values?
If using SQL Server 2005+, the ROW_NUMBER function should do fine.
SELECT A, B, C, 'DDD' AS D,
ROW_NUMBER() OVER (ORDER BY A) AS E
FROM Table
You can use a subquery to get the incremental value. This relies upon an ordering and unique values in column A (ties in A would produce duplicate numbers).
SELECT
A,
B,
C,
'DDD' [D],
(SELECT COUNT(*) FROM test_table test_table2 WHERE test_table2.A <= test_table.A) [E]
FROM
test_table
ORDER BY
A
This is blindly answering your question... I have a feeling there is more to your question and much better ways of actually achieving your final goal.
SELECT A, B, C, 'DDD' as "D", 1 as "E"
FROM myTable
WHERE ...
Is the 1, 2, 3 hardcoded, or does this need to be an incremented value?
Yes, in fact this is pretty easy. It would be something like this:
SELECT field1, field2, 'Default' AS field3, 'Default2' AS field4 FROM table WHERE ...
You can get the incrementing field in Oracle with the ROWNUM pseudo-column. But this will differ for other databases.
I'm guessing the last field is a row counter, and you don't specify the database, but for Oracle this should work:
select A, B, C, 'DDD', rownum
from ex;
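One caveat: ROWNUM is assigned before any ORDER BY is applied, so if the numbering has to follow a particular order, either wrap the ordered query in a subquery or use ROW_NUMBER() OVER (ORDER BY ...) as in the next answer.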
SELECT A,
B,
C,
'DDD' AS D,
ROW_NUMBER() OVER (ORDER BY A) AS E
FROM myTable
Also look here for an answer sql-query-that-numerates-the-returned-result