Query to write extra rows in Excel output - sql

I'm trying to accomplish something that seems like it should be straightforward in MS Excel. I want to use a single SQL query - so I can pass it on to others to copy and paste - though I know the following could be achieved with other methods as well. Sheet 1 looks like this:
ID value value_type
1 minneapolis city_name
2 cincinnati city_name
I want an SQL query to return an "exploded" version of those two rows:
ID attr_name attr_value
1 value minneapolis
1 value_type city_name
2 value cincinnati
2 value_type city_name
There's much more I need to do, but this concept gets at the heart of the issue. I've tried a single SELECT statement, but can't seem to make it create two rows from one, and when I tried using UNION ALL I got a syntax error.
In Microsoft Query, how can I construct an SQL statement to create two rows from the existing values in one row?
UPDATE
Thanks for the help so far. First, for reference, here is the default statement that recreates the table in Microsoft Query:
SELECT
`Sheet3$`.ID,
`Sheet3$`.name,
`Sheet3$`.name_type
FROM `path\testconvert.xlsx`.`Sheet3$` `Sheet3$`
So, following @lad2025's lead, I have:
SELECT
ID = `Sheet3$`.ID
,attr_name = 'value'
,attr_value = `Sheet3$`.value
FROM `path\testconvert.xlsx`.`Sheet3$` `Sheet3$`
UNION ALL
SELECT
ID = `Sheet3$`.ID
,attr_name = 'value_type'
,attr_value = `Sheet3$`.value_type
FROM `path\testconvert.xlsx`.`Sheet3$` `Sheet3$`
And the result is this error: "Too few parameters. Expected 4."

LiveDemo
CREATE TABLE #mytable(
ID INTEGER NOT NULL PRIMARY KEY
,value VARCHAR(11) NOT NULL
,value_type VARCHAR(9) NOT NULL
);
INSERT INTO #mytable(ID,value,value_type) VALUES (1,'minneapolis','city_name');
INSERT INTO #mytable(ID,value,value_type) VALUES (2,'cincinnati','city_name');
SELECT
ID
,[attr_name] = 'value'
,[attr_value] = value
FROM #mytable
UNION ALL
SELECT
ID
,[attr_name] = 'value_type'
,[attr_value] = value_type
FROM #mytable
ORDER BY id;

OK, after going back to the original statement and working up from there as per the suggestions from @lad2025, I've come up with this statement, which achieves what I was looking for in my original question:
SELECT
ID,
'name' AS [attr_name],
name AS [attr_value]
FROM `path\testconvert.xlsx`.`Sheet3$` `Sheet3$`
UNION ALL
SELECT
ID,
'name_type',
name_type
FROM `path\testconvert.xlsx`.`Sheet3$` `Sheet3$`
ORDER BY ID;
One of the main problems is that the new column names can only be defined in the first SELECT statement. Also, brackets are OK, just not the way @lad2025 was using them originally.
Microsoft Query is pretty finicky.
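The same UNION ALL pattern can be tried outside Excel. The sketch below is only an illustration: it uses Python's built-in sqlite3 module and a hypothetical table named sheet as a stand-in for the worksheet, so the dialect differs from the Jet/ACE SQL that Microsoft Query runs, but the shape of the query (aliases in the first SELECT only, matching column counts) is the same.

```python
import sqlite3

# In-memory stand-in for the worksheet; the table name "sheet" is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sheet (ID INTEGER, value TEXT, value_type TEXT)")
conn.executemany(
    "INSERT INTO sheet VALUES (?, ?, ?)",
    [(1, "minneapolis", "city_name"), (2, "cincinnati", "city_name")],
)

# Column aliases are declared only in the first SELECT; the second SELECT
# just supplies the same number of columns in the same order.
query = """
SELECT ID, 'value' AS attr_name, value AS attr_value FROM sheet
UNION ALL
SELECT ID, 'value_type', value_type FROM sheet
ORDER BY ID, attr_name
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each source row comes back twice, once per attribute, which is the "exploded" layout asked for in the question.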

Related

Struggling with a complicated query on row-based Field/Value table

Bear with me for a little bit of setup here, please.
I have a table MAIN that has a Field/Value representation that looks like this:
I have another table called STORE_FLAG:
I am trying to write a parameterized query for which I will be given one FIELD_ID and one or more IDs from the STORE_FLAG table.
What I need to do is select from the MAIN table ROW_IDs where:
for the given FIELD_ID, the VALUE = 'YES' AND
for the given STORE_FLAG_IDS, ANY of those FIELD_IDs correspond to a VALUE = 'x' in the MAIN table.
Not that this would be a good idea, but I cannot pivot the whole table into a column-based table to then do a traditional where clause.
Example:
Given a Field_Id = 1 and a list of StoreIds = (30,50). I would want to return row_ids 1 and 2. This is because row_id 1 and 2 have a field_id 1 with value 'YES' AND at least one of the field_ids 3 and 5 have a value 'x'. But row_id 3 has a value of null for both field_id 3 and 5 and row_id 4 has a field_id 1 with value = 'NO'.
I was thinking something like this:
SELECT DISTINCT ROW_ID FROM MAIN
WHERE (FIELD_ID = :providedFieldId OR FIELD_ID IN (SELECT FIELD_ID FROM STORE_FLAG WHERE ID IN :providedStoreIdList))
AND (FIELD_VALUE = 'YES' OR FIELD_VALUE = 'x')
which (I think) works, but feels naïve to me..? I feel like there is some sort of super duper grouping way to do this, but I can't wrap my head around it. Any suggestions would be really appreciated.
Here is a way to do this:
select distinct m.row_id
from main m
where m.field_id=:providedFieldId
and m.field_value='YES'
and exists (select 1
from STORE_FLAG sf
join main m2
on sf.field_id=m2.field_id
where sf.id in ('30','50') /* you need to bind the values from :providedStoreIdList using a table function*/
and m2.field_value='x'
and m2.row_id=m.row_id
)
Here is a link on how to bind an IN list:
https://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:110612348061
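To see the EXISTS approach next to the query from the question, here is a small sketch using Python's sqlite3 module. The table contents are hypothetical: the original tables are not shown, so the rows below were made up to match the worked example (store IDs 30 and 50 mapped to field IDs 3 and 5). SQLite has no array binds, so the IN list is built from one placeholder per value; that detail is an assumption of this sketch, not the Oracle technique from the link.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE main (row_id INTEGER, field_id INTEGER, field_value TEXT);
CREATE TABLE store_flag (id INTEGER, field_id INTEGER);
INSERT INTO store_flag VALUES (30, 3), (50, 5);
-- Hypothetical rows consistent with the example in the question.
INSERT INTO main VALUES
  (1, 1, 'YES'), (1, 3, 'x'),  (1, 5, NULL),
  (2, 1, 'YES'), (2, 3, NULL), (2, 5, 'x'),
  (3, 1, 'YES'), (3, 3, NULL), (3, 5, NULL),
  (4, 1, 'NO'),  (4, 3, 'x'),  (4, 5, NULL);
""")

field_id = 1
store_ids = [30, 50]
placeholders = ", ".join("?" for _ in store_ids)  # one "?" per store id

# Both conditions must hold for the same row_id.
exists_sql = f"""
SELECT DISTINCT m.row_id
FROM main m
WHERE m.field_id = ?
  AND m.field_value = 'YES'
  AND EXISTS (
        SELECT 1
        FROM store_flag sf
        JOIN main m2 ON sf.field_id = m2.field_id
        WHERE sf.id IN ({placeholders})
          AND m2.field_value = 'x'
          AND m2.row_id = m.row_id)
ORDER BY m.row_id
"""
matching = [r[0] for r in conn.execute(exists_sql, [field_id, *store_ids])]

# The query from the question: either condition alone is enough to match.
naive_sql = f"""
SELECT DISTINCT row_id
FROM main
WHERE (field_id = ? OR field_id IN
         (SELECT field_id FROM store_flag WHERE id IN ({placeholders})))
  AND (field_value = 'YES' OR field_value = 'x')
ORDER BY row_id
"""
naive = [r[0] for r in conn.execute(naive_sql, [field_id, *store_ids])]

print(matching)
print(naive)
```

On this sample data the EXISTS version returns only rows 1 and 2, while the query from the question also returns rows 3 and 4, because it accepts a row that satisfies just one of the two conditions.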
The query you proposed will not work the way you describe, because its last line, AND (FIELD_VALUE = 'YES' OR FIELD_VALUE = 'x'), conflicts with your requirement. With that query you get a ROW_ID when either condition is true (FIELD_VALUE = 'YES' or FIELD_VALUE = 'x'), which is wrong: both must hold. See the query below:
SELECT SUB_QUERY.ROW_ID FROM
(
select DISTINCT MAIN.ROW_ID,MAIN.FIELD_VALUE from STORE_FLAG
RIGHT OUTER JOIN MAIN ON STORE_FLAG.FIELD_ID=MAIN.FIELD_ID
WHERE ((STORE_FLAG.ID IN ('202','203') AND MAIN.FIELD_VALUE='x')
OR (MAIN.FIELD_ID ='1' AND MAIN.FIELD_VALUE='YES'))
) SUB_QUERY
GROUP BY SUB_QUERY.ROW_ID
HAVING (LISTAGG(SUB_QUERY.FIELD_VALUE, ',') WITHIN GROUP (ORDER BY SUB_QUERY.ROW_ID) IN ('YES,x','x,YES'))
I suggest running the subquery on its own first, to understand what it returns.

Want to concatenate column of the second query to the first query but getting errors such as "query block has incorrect number of result columns"

SELECT
ID, PRIM_EMAIL, SEC_EMAIL, PHONE
FROM
STUDENTS.RECORDS
WHERE
ID IN (SELECT ID FROM STUDENTS.INFO WHERE ROLL_NO = '554')
UNION
SELECT NAME
FROM STUDENTS.INFO
WHERE ROLL_NO = '554';
Here ROLL_NO is user-supplied data, so for now I have hard-coded it. Basically, with the help of ROLL_NO I filter the STUDENTS.INFO table to get the ID, and based on that I try to get PRIM_EMAIL, SEC_EMAIL, PHONE from the STUDENTS.RECORDS table by matching the foreign keys of both tables. In addition to the current result set I also want to have the prov_name column.
Any help is very much appreciated. Thank you!
I suspect that you want to put all this information on the same row, which suggests a join rather than union all:
select
r.ID,
r.PRIM_EMAIL,
r.SEC_EMAIL,
r.PHONE,
i.NAME
from STUDENTS.RECORDS r
inner join STUDENTS.INFO i ON i.ID = r.ID
where i.ROLL_NO = '554';
I think the error "query block has incorrect number of result columns" comes from trying to union a query that returns 4 columns (id, prim_email, sec_email, phone) with one that returns 1 column (name).
From your question, I am gathering that you want a single table of id, prim_email, sec_email, phone from students.records and name from students.info.
I think the following query using CTEs might get you (partially) to your final result. You may want to refactor it to optimize performance.
with s_records as ( select * from students.records ),
s_info as ( select * from students.info ),
joined as (
select
s_records.id,
s_records.prim_email,
s_records.sec_email,
s_records.phone,
s_info.name
from s_records
left join s_info
on s_records.id = s_info.id
where s_info.roll_no = '554' )
select * from joined
Overall, I think that a join will be part of your solution rather than a union :-)
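A minimal sketch of the difference, using Python's sqlite3 module rather than Oracle. The unqualified table names records and info and the sample rows are assumptions made for the illustration; SQLite's error text also differs from Oracle's, but the cause is the same mismatch in column counts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical stand-ins for STUDENTS.INFO and STUDENTS.RECORDS.
CREATE TABLE info (id INTEGER PRIMARY KEY, roll_no TEXT, name TEXT);
CREATE TABLE records (id INTEGER REFERENCES info(id),
                      prim_email TEXT, sec_email TEXT, phone TEXT);
INSERT INTO info VALUES (1, '554', 'Alice'), (2, '555', 'Bob');
INSERT INTO records VALUES
  (1, 'alice@example.com', 'a2@example.com', '555-0100'),
  (2, 'bob@example.com',   'b2@example.com', '555-0101');
""")

# A join puts the name on the same row as the contact details.
join_sql = """
SELECT r.id, r.prim_email, r.sec_email, r.phone, i.name
FROM records r
INNER JOIN info i ON i.id = r.id
WHERE i.roll_no = ?
"""
row = conn.execute(join_sql, ("554",)).fetchone()

# A UNION of a 4-column SELECT with a 1-column SELECT is rejected.
try:
    conn.execute(
        "SELECT id, prim_email, sec_email, phone FROM records "
        "UNION SELECT name FROM info"
    )
    union_error = None
except sqlite3.OperationalError as exc:
    union_error = str(exc)

print(row)
print(union_error)
```

The join returns one row per student with all five columns, whereas the mismatched UNION never runs at all.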

I need to SELECT a group of columns not null

Basically, I have a table that has a series of columns named:
ATTRIBUTE10, ATTRIBUTE11, ATTRIBUTE12 ... ATTRIBUTE50
I want a query that gives me all the columns from ATTRIBUTE10 to ATTRIBUTE50 not null
As others have commented, we aren't exactly sure of your requirements, but if you want a list, UNPIVOT can do that:
SELECT attribute , value
FROM
(SELECT * from YourFile) p
UNPIVOT
(value FOR attribute IN
(attribute1, attribute2, attribute3, etc.)
)AS unpvt
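Maybe you can use a WHERE condition that lists all the columns, or the BETWEEN operator, as below.
For all columns:
where ATTRIBUTE10 is not null and ATTRIBUTE11 is not null ...... and ATTRIBUTE50 is not null
Using the BETWEEN operator (note that this compares the value of ATTRIBUTE10 with the values of ATTRIBUTE11 and ATTRIBUTE50; it does not test the columns in between for NULL):
where ATTRIBUTE10 between ATTRIBUTE11 and ATTRIBUTE50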
One way to approach the problem is to unfold your table-with-a-zillion-like-named-attributes into one in which you've got one attribute per row, with appropriate foreign keys back to the original table. So something like:
CREATE TABLE ATTR_TABLE AS
SELECT ID_ATTR, ID_TABLE_WITH_ATTRS, ATTR
FROM (SELECT ((ID_TABLE_WITH_ATTRS-1)*100)+1 AS ID_ATTR, ID_TABLE_WITH_ATTRS, ATTRIBUTE10 AS ATTR FROM TABLE_WITH_ATTRS UNION ALL
SELECT ((ID_TABLE_WITH_ATTRS-1)*100)+2, ID_TABLE_WITH_ATTRS, ATTRIBUTE11 FROM TABLE_WITH_ATTRS UNION ALL
SELECT ((ID_TABLE_WITH_ATTRS-1)*100)+3, ID_TABLE_WITH_ATTRS, ATTRIBUTE12 FROM TABLE_WITH_ATTRS);
This only unfolds ATTRIBUTE10, ATTRIBUTE11, and ATTRIBUTE12, but you should be able to get the idea - the rest of the attributes just requires a little cut-n-paste on your part.
You can then query this table to find your non-NULL attributes as
SELECT *
FROM ATTR_TABLE
WHERE ATTR IS NOT NULL
ORDER BY ID_ATTR
Hopefully the difficulty you're encountering in dealing with this table-with-a-zillion-repeated-fields teaches you a hard lesson about exactly why tables with repeated fields or groups of fields are a Bad Idea.
dbfiddle here
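The cut-and-paste step can also be generated. The following sketch, using Python's sqlite3 module, builds one SELECT per column from ATTRIBUTE10 to ATTRIBUTE50 and joins them with UNION ALL. The ID column, the ATTR_NAME alias, and the sample row are assumptions made for the illustration, and the result is queried directly instead of being stored with CREATE TABLE AS.

```python
import sqlite3

columns = [f"ATTRIBUTE{n}" for n in range(10, 51)]  # ATTRIBUTE10 .. ATTRIBUTE50

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TABLE_WITH_ATTRS (ID INTEGER PRIMARY KEY, "
    + ", ".join(f"{c} TEXT" for c in columns)
    + ")"
)
# Hypothetical row: only three of the 41 attributes are populated.
conn.execute(
    "INSERT INTO TABLE_WITH_ATTRS (ID, ATTRIBUTE10, ATTRIBUTE12, ATTRIBUTE50) "
    "VALUES (1, 'a', 'c', 'z')"
)

# One SELECT per attribute column, glued together with UNION ALL.
branches = [
    f"SELECT ID, '{c}' AS ATTR_NAME, {c} AS ATTR FROM TABLE_WITH_ATTRS"
    for c in columns
]
unpivot_sql = (
    "SELECT ID, ATTR_NAME, ATTR FROM ("
    + " UNION ALL ".join(branches)
    + ") WHERE ATTR IS NOT NULL ORDER BY ID, ATTR_NAME"
)
non_null = conn.execute(unpivot_sql).fetchall()
print(non_null)
```

The column names here come from a fixed list in the script, so building the SQL text with string formatting is safe; names taken from user input would need validating first.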

Writing a single UPDATE statement that prevents duplicates

I've been trying for a few hours (probably more than I needed to) to figure out the best way to write an UPDATE SQL query that will disallow duplicates on the column I am updating.
Meaning, if TableA.ColA already has a name 'TEST1', then when I'm changing another record, then I simply can't pick a value for ColA to be 'TEST1'.
It's pretty easy to simply just separate the query into a select, and use a server layer code that would allow conditional logic:
SELECT ID, NAME FROM TABLEA WHERE NAME = 'TEST1'
IF TableA.recordcount = 0 then
UPDATE TABLEA SET NAME = 'TEST1' WHERE ID = 1234
END IF
But I'm more interested to see if these two queries can be combined into a single query.
I am using Oracle to figure things out, but I'd love to see a SQL Server query as well. I figured a MERGE statement can work, but for obvious reasons you can't have the clause:
..etc.. WHEN NOT MATCHED UPDATE SET ..etc.. WHERE ID = 1234
AND you can't update a column if it's mentioned in the join (oracle limitation but not limited to SQL Server)
ALSO, I know you can put a constraint on a column that prevents duplicate values, but I'd be interested to see if there is such a query that can do this without using constraint.
Here is an example of an initial attempt on my end, just to see what I could come up with (explanations of why it failed are not necessary):
ERROR: ORA-01732: data manipulation operation not legal on this view
UPDATE (
SELECT d.NAME, ch.NAME FROM (
SELECT 'test1' AS NAME, '2722' AS ID
FROM DUAL
) d
LEFT JOIN TABLEA a
ON UPPER(a.name) = UPPER(d.name)
)
SET a.name = 'test2'
WHERE a.name is null and a.id = d.id
I have tried merge, but just gave up thinking it's not possible. I've also considered not exists (but I'd have to be careful since I might accidentally update every other record that doesn't match a criteria)
It should be straightforward:
update personnel
set personnel_number = 'xyz'
where person_id = 1001
and not exists (select * from personnel where personnel_number = 'xyz');
If I understand correctly, you want to conditionally update a field, assuming the value is not found. The following query does this. It should work in both SQL Server and Oracle:
update table1
set name = 'Test1'
where (select count(*) from table1 where name = 'Test1') = 0 and
id = 1234
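A minimal sketch of the single-statement approach, run against SQLite through Python's sqlite3 module. The table contents and the rename helper are hypothetical. The number of affected rows tells the caller whether the update was applied or skipped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablea (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO tablea VALUES (?, ?)",
    [(1234, "OLD"), (5678, "TEST1")],
)

def rename(conn, row_id, new_name):
    """Rename one row unless any row already uses new_name; return rows changed."""
    cur = conn.execute(
        "UPDATE tablea SET name = ? "
        "WHERE id = ? "
        "AND NOT EXISTS (SELECT 1 FROM tablea WHERE name = ?)",
        (new_name, row_id, new_name),
    )
    return cur.rowcount

blocked = rename(conn, 1234, "TEST1")  # TEST1 is already used by row 5678
allowed = rename(conn, 1234, "TEST2")  # TEST2 is free
print(blocked, allowed)
```

Because the id filter and the NOT EXISTS check sit in the same WHERE clause, only the targeted row can ever change. Without a unique constraint, two concurrent sessions could still both pass the check, so this only guards against duplicates within a single session.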

Select distinct values from multiple columns in same table

I am trying to construct a single SQL statement that returns unique, non-null values from multiple columns all located in the same table.
SELECT distinct tbl_data.code_1 FROM tbl_data
WHERE tbl_data.code_1 is not null
UNION
SELECT tbl_data.code_2 FROM tbl_data
WHERE tbl_data.code_2 is not null;
For example, tbl_data is as follows:
id code_1 code_2
--- -------- ----------
1 AB BC
2 BC
3 DE EF
4 BC
For the above table, the SQL query should return all unique non-null values from the two columns, namely: AB, BC, DE, EF.
I'm fairly new to SQL. My statement above works, but is there a cleaner way to write this SQL statement, since the columns are from the same table?
It's better to include code in your question, rather than ambiguous text data, so that we are all working with the same data. Here is the sample schema and data I have assumed:
CREATE TABLE tbl_data (
id INT NOT NULL,
code_1 CHAR(2),
code_2 CHAR(2)
);
INSERT INTO tbl_data (
id,
code_1,
code_2
)
VALUES
(1, 'AB', 'BC'),
(2, 'BC', NULL),
(3, 'DE', 'EF'),
(4, NULL, 'BC');
As Blorgbeard commented, the DISTINCT clause in your solution is unnecessary because the UNION operator eliminates duplicate rows. There is a UNION ALL operator that does not eliminate duplicates, but it is not appropriate here.
Rewriting your query without the DISTINCT clause is a fine solution to this problem:
SELECT code_1
FROM tbl_data
WHERE code_1 IS NOT NULL
UNION
SELECT code_2
FROM tbl_data
WHERE code_2 IS NOT NULL;
It doesn't matter that the two columns are in the same table. The solution would be the same even if the columns were in different tables.
If you don't like the redundancy of specifying the same filter clause twice, you can encapsulate the union query in a virtual table before filtering that:
SELECT code
FROM (
SELECT code_1
FROM tbl_data
UNION
SELECT code_2
FROM tbl_data
) AS DistinctCodes (code)
WHERE code IS NOT NULL;
I find the syntax of the second more ugly, but it is logically neater. But which one performs better?
I created a sqlfiddle that demonstrates that the query optimizer of SQL Server 2005 produces the same execution plan for the two different queries:
If SQL Server generates the same execution plan for two queries, then they are practically as well as logically equivalent.
Compare the above to the execution plan for the query in your question:
The DISTINCT clause makes SQL Server 2005 perform a redundant sort operation, because the query optimizer does not know that any duplicates filtered out by the DISTINCT in the first query would be filtered out by the UNION later anyway.
This query is logically equivalent to the other two, but the redundant operation makes it less efficient. On a large data set, I would expect your query to take longer to return a result set than the two here. Don't take my word for it; experiment in your own environment to be sure!
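The de-duplicating behaviour of UNION is easy to check on the sample data. The sketch below uses Python's sqlite3 module instead of SQL Server, so it says nothing about execution plans; it only compares the result of UNION with that of UNION ALL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_data (id INT NOT NULL, code_1 TEXT, code_2 TEXT)")
conn.executemany(
    "INSERT INTO tbl_data VALUES (?, ?, ?)",
    [(1, "AB", "BC"), (2, "BC", None), (3, "DE", "EF"), (4, None, "BC")],
)

union_sql = """
SELECT code_1 AS code FROM tbl_data WHERE code_1 IS NOT NULL
UNION
SELECT code_2 FROM tbl_data WHERE code_2 IS NOT NULL
ORDER BY code
"""
# Same query, but keeping duplicates, for comparison.
union_all_sql = union_sql.replace("UNION", "UNION ALL")

distinct_codes = [r[0] for r in conn.execute(union_sql)]
all_codes = [r[0] for r in conn.execute(union_all_sql)]
print(distinct_codes)
print(all_codes)
```

UNION alone yields each code once, with no DISTINCT needed, while UNION ALL returns BC three times.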
Try something like a subquery:
SELECT derivedtable.NewColumn
FROM
(
SELECT code_1 as NewColumn FROM tbl_data
UNION
SELECT code_2 as NewColumn FROM tbl_data
) derivedtable
WHERE derivedtable.NewColumn IS NOT NULL
The UNION already returns DISTINCT values from the combined query.
UNION applies wherever the rows being combined are similar in terms of type, values, etc. It doesn't matter whether the columns come from the same table or from different ones; the results are the same (as one of the answers above already mentions).
Since you don't want duplicates, there is no point in using UNION ALL, and DISTINCT is simply unnecessary because UNION already returns distinct data.
Creating a view may be the best choice, as a view is a virtual representation of the table. Modifications can then be made neatly on that view:
Create VIEW getData AS
(
SELECT distinct tbl_data.code_1
FROM tbl_data
WHERE tbl_data.code_1 is not null
UNION
SELECT tbl_data.code_2
FROM tbl_data
WHERE tbl_data.code_2 is not null
);
Try this if you have more than two Columns:
CREATE TABLE #temptable (Name1 VARCHAR(25),Name2 VARCHAR(25))
INSERT INTO #temptable(Name1, Name2)
VALUES('JON', 'Harry'), ('JON', 'JON'), ('Sam','harry')
SELECT t.Name1+','+t.Name2 Names INTO #t FROM #temptable AS t
SELECT DISTINCT ss.value FROM #t AS t
CROSS APPLY STRING_SPLIT(T.Names,',') AS ss