I'm trying to pull data from an Access database into an Excel table with an SQL query. The problem is that my Access database has columns with similar data that I want to combine into one single column. This should duplicate the data in the other columns for each entry. I'm not great with SQL but I think I have the basics down.
Database structure that I have:
Date | Product | Hours 1 | Reason 1 | Hours 2 | Reason 2 |
2019 A 3 "xxx" 5 "yyy"
Excel table that I want:
Date | Product | Hours | Reason |
2019 A 3 "xxx"
2019 A 5 "yyy"
Also not sure if it's possible but it would be great to see the source column of each
Date | Product | Hours | Reason | Source |
2019 A 3 "xxx" "Hours 1"
2019 A 5 "yyy" "Hours 2"
I've tried UNION ALL and got duplicates of the data, but not merged into one column. I'm about to try INSERT INTO, but I'm sort of lost on how to get each one into the same column.
Try this
SELECT [Date], Product, Hours, Reason, Source
FROM (
SELECT [Date], Product, [Hours 1] AS Hours, [Reason 1] AS Reason, "Hours 1" AS Source
FROM [Table]
UNION ALL
SELECT [Date], Product, [Hours 2], [Reason 2], "Hours 2"
FROM [Table]
) AS combined
It looks like you have a bad data structure in the table. By that I mean it's a "flat" table with multiple hours in one row for a record. This is generally a PITA when it comes to reviewing data in many-to-one situations. Normally there would be a table where records get logged separately for each hour involved. I understand you probably didn't build it, but it's worth pointing out for your own information.
Fundamentally, this issue would be easier to approach once you understand how that less-than-desirable structure affects what your task is. This is essentially, in my mind, a pivot problem, or more precisely an unpivot: PIVOT in SQL turns rows into columns, and here you need to turn columns into rows. There are many ways to pivot data with code, so pick your favorite. Most people actually use the PIVOT function, whereas I tend to teeter between CTEs (common table expressions) and PIVOT. IMO CTEs are easier to read once you understand them. Because Access SQL doesn't support UNPIVOT or CTEs, we just have to treat the body of what a CTE would've been as a subquery in the FROM clause.
SELECT x.*
FROM
(
SELECT
[Date],
Product,
[Hours 1] AS Hours,
[Reason 1] AS Reason,
"Hours 1" AS Source
FROM yourTableName
UNION ALL
SELECT
[Date],
Product,
[Hours 2] AS Hours,
[Reason 2] AS Reason,
"Hours 2" AS Source
FROM yourTableName
) AS x
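If you want to try the UNION ALL approach outside of Access first, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in for Access here (an assumption on my part), so identifiers are quoted with double quotes instead of square brackets, and the sample row is the one from the question.

```python
import sqlite3

# In-memory SQLite database standing in for the Access file (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE yourTableName ('
    '"Date" INTEGER, Product TEXT, '
    '"Hours 1" INTEGER, "Reason 1" TEXT, '
    '"Hours 2" INTEGER, "Reason 2" TEXT)'
)
# The single sample row from the question.
conn.execute("INSERT INTO yourTableName VALUES (2019, 'A', 3, 'xxx', 5, 'yyy')")

# One SELECT per column pair; UNION ALL stacks them and keeps every row.
unpivot_sql = """
SELECT "Date", Product, "Hours 1" AS Hours, "Reason 1" AS Reason, 'Hours 1' AS Source
FROM yourTableName
UNION ALL
SELECT "Date", Product, "Hours 2", "Reason 2", 'Hours 2'
FROM yourTableName
ORDER BY Source
"""
rows = conn.execute(unpivot_sql).fetchall()
for row in rows:
    print(row)
```

UNION ALL is used rather than UNION because plain UNION removes duplicate rows, which would silently drop two identical hour entries.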
I am relatively new to SQL...
I am creating a summary of returned items and I would like the finished result to show the item code, the amount returned (SUM) and the reason for return. So Ideally it would be something like this:
101 - Blue Widget | 13 | Shipment Lost
101 - Blue Widget | 3 | Damaged in Transit
102 - Red Widget | 5 | Shipment Lost
So it is grouping by ITEM and RMACODE and summing the quantities
Here is a simplified version of the query I wrote for this
Select ITEM, SUM(QUANTITY), RMACODE, DATEENTERED
FROM RMAITEMS
group by ITEM, Quantity, RMACODE
I am loading this in SSRS and need DATEENTERED for my report parameters to only pull records between #StartDate and #EndDate. I get an error saying DATEENTERED is invalid because it is not in the GROUP BY.
Is there a better/different way to achieve the result I am looking for?
Thanks
Andrew
I made the changes suggested by edkloczko and it appeared everything would work then, but since we removed the date from the select statement I am unable to use it in my report parameters. Here is a screenshot. I have a few ideas I will try out today but if anyone has already climbed this hill and can help me with directions I would be grateful.
Expression Needed is Absent
If you're looking to filter by date and don't actually need the date field...
SELECT ITEM, SUM(QUANTITY), RMACODE
FROM RMAITEMS
WHERE DATEENTERED>=STARTDATE AND DATEENTERED<=ENDDATE
GROUP BY ITEM, RMACODE
This will give you all the records you need and makes the extra filtering step you're doing unnecessary - it will only select the records between the start and end dates.
I've run into the same issue before with our IBM DB2. As far as I know, every SELECT item that isn't inside an aggregate function like SUM has to appear in the GROUP BY clause. Unsure if this is specific to certain databases or not.
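Here is a small sketch of the filter-then-group idea, run against SQLite through Python's sqlite3 module. The sample rows, the ISO date strings and the parameter names start_date and end_date are made up for illustration; in SSRS the two values would come from the report parameters instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE RMAITEMS (ITEM TEXT, QUANTITY INTEGER, RMACODE TEXT, DATEENTERED TEXT)"
)
# Hypothetical sample data; dates are ISO strings so they compare correctly as text.
conn.executemany(
    "INSERT INTO RMAITEMS VALUES (?, ?, ?, ?)",
    [
        ("101 - Blue Widget", 10, "Shipment Lost", "2019-01-05"),
        ("101 - Blue Widget", 3, "Shipment Lost", "2019-01-20"),
        ("101 - Blue Widget", 3, "Damaged in Transit", "2019-01-21"),
        ("102 - Red Widget", 5, "Shipment Lost", "2019-01-22"),
        ("102 - Red Widget", 7, "Shipment Lost", "2019-03-01"),  # outside the range
    ],
)

# DATEENTERED is used only in WHERE, so it does not need to be in SELECT or GROUP BY.
summary_sql = """
SELECT ITEM, SUM(QUANTITY) AS TOTAL_RETURNED, RMACODE
FROM RMAITEMS
WHERE DATEENTERED >= :start_date AND DATEENTERED <= :end_date
GROUP BY ITEM, RMACODE
ORDER BY ITEM, RMACODE
"""
summary = conn.execute(
    summary_sql, {"start_date": "2019-01-01", "end_date": "2019-01-31"}
).fetchall()
for row in summary:
    print(row)
```

Note that QUANTITY is not in the GROUP BY: grouping by the column being summed would give one row per distinct quantity instead of one total per item and reason.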
I am querying a table in a PG database which contains a period (character varying(255)) and a value (integer), which looks something like:
|period|value|
|Months|3 |
|Months|6 |
|Weeks |1 |
|Years |5 |
After a few joins, I'm looking to subset my result set to only include a subset of these period / value combinations, for example I may only want 3 Months, 6 Months and 5 Years (so not 1 Weeks).
I'd usually reach to a WHERE IN(..) but don't think I can do this across two columns. Instead I've tried to make a composite column by:
CURRENT_DATE + CAST(CONCAT(tbl.value, tbl.period) AS INTERVAL)
Producing a column of timestamps which I can then subset with an IN('2019-05-18', '2019-08-18', '2024-02-18').
This works but isn't particularly pretty or efficient. Is there a better way?
I'm free to change my query (so I can subset by dates as I currently am, or by 3 and Months) but importantly I do not know ahead of time whether 2 Years will be stored as 24 Months (nor do I have control of the table).
Thanks!
You can say
WHERE (period, value) IN (('Months', 3), ('Months', 6), ('Years', 5))
and use an index over both columns.
I hope I got the syntax right; there might be a ROW missing somewhere.
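To see the two-column IN in action without a PostgreSQL server, here is a rough sketch using Python's sqlite3 module. One difference to be aware of: SQLite (3.15 or newer) wants the right-hand side written as a VALUES list, whereas PostgreSQL also accepts the plain list of tuples shown above. The table name tbl and its rows are taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (period TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?)",
    [("Months", 3), ("Months", 6), ("Weeks", 1), ("Years", 5)],
)

# Row-value comparison: both columns must match one of the listed pairs.
wanted_sql = """
SELECT period, value
FROM tbl
WHERE (period, value) IN (VALUES ('Months', 3), ('Months', 6), ('Years', 5))
ORDER BY period, value
"""
wanted = conn.execute(wanted_sql).fetchall()
print(wanted)
```

This only matches the pairs exactly as stored, so it will not treat 24 Months and 2 Years as the same thing; if that matters, comparing the computed intervals is still needed.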
I am working with BigQuery. I have two tables:
organisations:
org_code STRING
name STRING
spending:
org STRING
month DATE
quantity INTEGER
code STRING
And then quite a complicated query to get results by each organisation, by month:
SELECT
organisations.org_code AS org,
num.month AS month,
(num.quantity / denom.quantity) AS ratio_quantity
FROM (
SELECT
org_code, name
FROM
[mytable.organisations]) AS organisations
LEFT OUTER JOIN EACH (
SELECT
org,
month,
SUM(quantity) AS quantity
FROM
[mytable.spending]
GROUP BY
org,
month) AS denom
ON
denom.org = organisations.org_code
LEFT OUTER JOIN EACH (
SELECT
org,
month,
SUM(quantity) AS quantity
FROM
[hscic.spending]
WHERE
code LIKE 'XXXX%'
GROUP BY
org,
month) AS num
ON
denom.month = num.month
AND denom.org = num.org
ORDER BY org, month
My final results look like this, with a row per org/month combination:
org,month,ratio_quantity
A81001,2015-10-01 00:00:00 UTC,28
A82001,2015-11-01 00:00:00 UTC,43
A82002,2015-10-01 00:00:00 UTC,16
Now I would like to pivot the results to look like this, with one row per month, and one column per organisation:
month,items.A81001,items.A82002...
2015-10-01 00:00:00 UTC,28,16
2015-11-01 00:00:00 UTC,43,...
Is this possible in the same BigQuery call? Or should I create a new table and pivot it from there? Or should I just do the reshaping in Python?
UPDATE: There are about 500,000 results, fyi.
Q. Is this possible in the same BigQuery call? Or should I create a new table and pivot it from there?
In general, you can use that “complicated query” as a subquery for extra logic to be applied to your current result.
So, it is definitely doable. But the code can quickly become hard to manage, so you can consider writing this result into a new table and then pivoting it from there.
If you stick with the direction of doing the pivot (the way you described in your question), check the link below for a detailed intro on how you can implement a pivot within BigQuery.
How to scale Pivoting in BigQuery?
Please note: there is a limit of 10K columns per table, so you are limited to 10K organizations.
You can also see the simplified examples below (if the one above is too complex/verbose):
How to transpose rows to columns with large amount of the data in BigQuery/SQL?
How to create dummy variable columns for thousands of categories in Google BigQuery?
Pivot Repeated fields in BigQuery
Q. Or should I just do the reshaping in Python?
If the above does not work for you, pivoting on the client is always an option, but then you should consider client-side limitations.
Hope this helped!
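For the client-side option, a minimal sketch in plain Python could look like the following. It assumes the query result has already been fetched as (org, month, ratio_quantity) tuples; the function name pivot_by_month is hypothetical, and the sample values are the three rows from the question.

```python
from collections import defaultdict

# Query result as (org, month, ratio_quantity) tuples, as in the question.
results = [
    ("A81001", "2015-10-01", 28),
    ("A82001", "2015-11-01", 43),
    ("A82002", "2015-10-01", 16),
]


def pivot_by_month(rows):
    """Reshape long rows into one row per month and one column per org."""
    orgs = sorted({org for org, _, _ in rows})
    by_month = defaultdict(dict)
    for org, month, ratio in rows:
        by_month[month][org] = ratio
    header = ["month"] + orgs
    # Missing org/month combinations become None.
    table = [
        [month] + [by_month[month].get(org) for org in orgs]
        for month in sorted(by_month)
    ]
    return header, table


header, table = pivot_by_month(results)
print(header)
for line in table:
    print(line)
```

With about 500,000 result rows this should fit in memory on most machines, but the number of distinct organisations decides how wide each output row gets.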
I tried searching for an answer to this question...I may not be wording my search correctly as I am not a super guru in SQL.
Situation:
Microsoft SQL Server 2008 R2 database, two tables I'm interested in right now, call them OpenOrders and InvoicedOrders.
I want to pull OpenOrders for the month, quarter, and year, and then InvoicedOrders for the month, quarter, and year, grouped by sales zone (sales zone in the same table).
I can't post an image, but if you imagine we have 5 sales zones and the 6 date ranges noted above, there would be 5 rows and 7 columns in the query result (the zone plus the 6 totals), shown in text below if it displays correctly.
1 10000 40000 12500 53200 12500 61180
2 23000 53000 25500 70490 25500 81063.5
3 45000 75000 47500 99750 47500 114712.5
4 43000 73000 45500 97090 45500 111653.5
5 76000 106000 78500 140980 78500 162127
What I want is a solution that is ideally one query, or a few queries, not 6 queries. I will be using this query in an SSRS report and was not successful with nested queries, as those queries returned the 'returned more than one result' error.
I am now thinking of using a temp table to select the first row, insert into temp table, select second row, insert into temp table, then select all results from temp table and drop temp table.
Hope I provided enough info!
Is a temp table an ideal solution, or is there a better one out there?
Thanks for any help!
If you are trying to get one result set back, consider creating a view and use UNION to join the results for these different queries inside. You can then run a select to get results from the view like a table.
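As a rough illustration of the view idea, here is a sketch run against SQLite through Python's sqlite3 module rather than SQL Server. The view name ZoneTotals and the columns SalesZone and Amount are assumptions, since the real table layout was not posted, and the date-range filters are left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical, simplified versions of the two tables.
    CREATE TABLE OpenOrders (SalesZone INTEGER, Amount REAL);
    CREATE TABLE InvoicedOrders (SalesZone INTEGER, Amount REAL);
    INSERT INTO OpenOrders VALUES (1, 4000), (1, 6000), (2, 23000);
    INSERT INTO InvoicedOrders VALUES (1, 12500), (2, 25500);

    -- One view that stacks the per-zone totals from both tables.
    CREATE VIEW ZoneTotals AS
    SELECT SalesZone, 'Open' AS Source, SUM(Amount) AS Total
    FROM OpenOrders
    GROUP BY SalesZone
    UNION ALL
    SELECT SalesZone, 'Invoiced', SUM(Amount)
    FROM InvoicedOrders
    GROUP BY SalesZone;
    """
)

# The report then needs only a single SELECT against the view.
zone_totals = conn.execute(
    "SELECT SalesZone, Source, Total FROM ZoneTotals ORDER BY SalesZone, Source"
).fetchall()
for row in zone_totals:
    print(row)
```

The Source column keeps track of which query each row came from, so the report can still tell the open and invoiced figures apart.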
Wondering if anyone can help with the code for this.
I want to query the data and get 2 entries, one for YTD previous year and one for this year YTD.
Only way I know how to do this is as 2 separate queries with where clauses.. I would prefer to not have to run the query twice.
One column called DatePeriod and populated with 2011 YTD and 2012YTD, would be even better if I could get it to do 2011YTD, 2012YTD, 2011Total, 2012Total... though guessing this is 4 queries.
Thanks
EDIT:
In response to help clear a few things up:
This is being coded in MS SQL.
The data looks like so: (very basic example)
Date | Call_Volume
1/1/2012 | 4
What I would like is to have the Call_Volume summed up, I have queries that group it by week, and others that do it by month. I could pull all the dailies in and do this in Excel but the table has millions of rows so always best to reduce the size of my output.
I currently group by Week/Month and Year and UNION ALL them so it's 1 output. But that means I have 3 queries accessing the same table, which is a large pain and very slow. That is fine, but now I also need a YTD, so it's either 1 more query or, if I could find a way to add it to the yearly query, that would be ideal:
So
DatePeriod | Sum_Calls
2011 Total | 40
2011 YTD | 12
2012 Total | 45
2012 YTD | 15
Hope this makes any sense.
SQL is built to do operations on rows, not columns (you select columns, of course, but aggregate operations are all on rows).
The most standard approach to this is something like:
SELECT SUM(your_table.sales), YEAR(your_table.sale_date)
FROM your_table
GROUP BY YEAR(your_table.sale_date)
Now you'll get one row for each year on record, with no limit to how many years you can process. If you're already grouping by another field, that's fine; you'll then get one row for each year in each of those groups.
Your program can then iterate over the rows and organize/render them however you like.
If you absolutely, positively must have columns instead, you'll be stuck with something like this:
SELECT SUM(IF(YEAR(sale_date) = 2011, sales, 0)) AS total_2011,
SUM(IF(YEAR(sale_date) = 2012, sales, 0)) AS total_2012
FROM your_table
If you're building the query programmatically you can add as many of those column criteria as you need, but I wouldn't count on this running very efficiently.
(These examples are written with some MySQL-specific functions. Corresponding functions exist for other engines but the syntax would be a little different.)
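For the Total plus YTD case from the question, the same conditional-sum trick can be combined with GROUP BY so the table is read only once. The sketch below uses SQLite through Python's sqlite3 module, with CASE WHEN in place of the MySQL IF function; the table name calls, the column names, the sample rows and the cutoff parameter (the month-day treated as "today") are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calls (call_date TEXT, call_volume INTEGER)")
# Hypothetical daily rows with ISO dates.
conn.executemany(
    "INSERT INTO calls VALUES (?, ?)",
    [
        ("2011-01-15", 5),
        ("2011-03-01", 7),
        ("2011-09-10", 28),
        ("2012-02-01", 15),
        ("2012-11-20", 30),
    ],
)

# One pass over the table: the plain SUM gives the yearly total, and the
# CASE expression only counts rows on or before the cutoff month-day.
ytd_sql = """
SELECT strftime('%Y', call_date) AS yr,
       SUM(call_volume) AS total_calls,
       SUM(CASE WHEN strftime('%m-%d', call_date) <= :cutoff
                THEN call_volume ELSE 0 END) AS ytd_calls
FROM calls
GROUP BY yr
ORDER BY yr
"""
totals = conn.execute(ytd_sql, {"cutoff": "03-31"}).fetchall()
for yr, total_calls, ytd_calls in totals:
    print(yr, "Total", total_calls)
    print(yr, "YTD", ytd_calls)
```

On SQL Server the date functions would be different (YEAR, MONTH and DAY rather than strftime), but CASE WHEN inside SUM works the same way.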