I need to retrieve a column from already retrieved column from a table - sql

I have a table with a column like this, which I retrieved with this query:
select distinct HDD_WP_RPTNG_AS_OF_SID
from wcadbo.WCA_MDW_D_HLDNGS_DATE
order by HDD_WP_RPTNG_AS_OF_SID desc;
Table:
HDD_WP_RPTNG_AS_OF_SID
20210501
20210430
20210429
20210428
It contains dates in integer format.
I wrote a query that returns these dates as a second column in date format, which I named AS_OF_DATE:
SELECT DISTINCT
HDD_WP_RPTNG_AS_OF_SID,
to_date(HDD_WP_RPTNG_AS_OF_SID,'YYYYMMDD') AS_OF_DATE
FROM
WCADBO.WCA_MDW_D_HLDNGS_DATE
ORDER BY
HDD_WP_RPTNG_AS_OF_SID DESC;
Result set:
HDD_WP_RPTNG_AS_OF_SID AS_OF_DATE
----------------------------------
20210501 01-MAY-21
20210430 30-APR-21
20210429 29-APR-21
20210428 28-APR-21
Now I need another column, Display_Date, of char type. It should show 'Last_Available_date' for the latest date in the previous column and the date as a string for all other dates.
I wrote this query, but it does not work:
SELECT
HDD_WP_RPTNG_AS_OF_SID,
AS_OF_DATE,
Display_date
FROM
(SELECT DISTINCT
HDD_WP_RPTNG_AS_OF_SID,
to_date(HDD_WP_RPTNG_AS_OF_SID,'YYYYMMDD') AS_OF_DATE
FROM
WCADBO.WCA_MDW_D_HLDNGS_DATE
ORDER BY
HDD_WP_RPTNG_AS_OF_SID DESC)
WHERE
Display_Date = (CASE
WHEN AS_OF_DATE = '01-MAY-21'
THEN 'Last_Available_date'
ELSE TO_CHAR(AS_OF_DATE, 'MON DD YYYY')
END);
In the end I need three columns. One is already in the table but slightly modified. The other two (AS_OF_DATE and Display_Date) are temporary columns that I need to derive.
I'm a beginner in SQL and couldn't figure out how to derive a column from another temporary column.
Kindly help. Thank you.
BTW, I am doing this in Oracle SQL Developer.

It looks like you want something like this:
SELECT
subQ.HDD_WP_RPTNG_AS_OF_SID,
subQ.AS_OF_DATE,
(CASE
WHEN subQ.AS_OF_DATE = date '2021-05-01'
THEN 'Last_Available_date'
ELSE TO_CHAR(subQ.AS_OF_DATE, 'MON DD YYYY')
END) Display_date
FROM
(SELECT DISTINCT
tbl.HDD_WP_RPTNG_AS_OF_SID,
to_date(tbl.HDD_WP_RPTNG_AS_OF_SID,'YYYYMMDD') AS_OF_DATE
FROM
WCADBO.WCA_MDW_D_HLDNGS_DATE tbl) subQ
ORDER BY
subQ.HDD_WP_RPTNG_AS_OF_SID DESC
Comments
If you want to add computed columns, that is done in the projection (the select list).
Always compare dates to dates and strings to strings. So in your CASE expression, compare the date as_of_date against another date. In this case I'm using a date literal. You could also call to_date on a string parameter.
If you want the results of the query ordered by a particular column, you want that order by applied at the outermost layer of the query, not in an inline view.
You basically always want to use aliases when referring to any column in a query. It's less critical in situations where everything is coming from one table, but as soon as you start referencing multiple tables in a query, it becomes annoying to look at a query and not be sure where a column is coming from. Even in a query like this, where there is an inline view and an outer query, it makes the query easier to read if you're explicit about where the columns are coming from.
Do you really need the distinct? I kept it because it was a part of the original query, but I get antsy whenever I see a distinct, particularly from people learning SQL. Doing a distinct is a relatively expensive operation, and it is very commonly used to cover up an underlying issue in the query (i.e. that you're getting multiple rows because some other column you aren't showing has multiple values) that ought to be addressed correctly (i.e. by adding an additional predicate to ensure that you're only getting each hdd_wp_rptng_as_of_sid once).
Storing dates as strings or numbers in tables (as is apparently done with hdd_wp_rptng_as_of_sid) is a really bad practice. If one person writes one row to the table where the value isn't in the right format, your query will suddenly stop working and start throwing errors, for example.
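As a rough illustration of the points above, here is a minimal sketch that runs against SQLite through Python's sqlite3 module rather than Oracle. The table and column names (hldngs_date, as_of_sid) are hypothetical stand-ins, and because SQLite has no to_date, the date is built as an ISO string. One variation from the answer's query: instead of hard-coding 2021-05-01, the CASE compares against the maximum value found by a scalar subquery, which is only one possible way to find the latest date.

```python
import sqlite3

# Hypothetical in-memory stand-in for the holdings date table (SQLite, not Oracle).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hldngs_date (as_of_sid INTEGER)")
conn.executemany(
    "INSERT INTO hldngs_date VALUES (?)",
    [(20210501,), (20210430,), (20210429,), (20210428,), (20210430,)],
)

query = """
SELECT sub.as_of_sid,
       sub.as_of_date,
       CASE
           -- Compare number to number; the latest value is looked up, not hard-coded.
           WHEN sub.as_of_sid = (SELECT MAX(t.as_of_sid) FROM hldngs_date t)
               THEN 'Last_Available_date'
           ELSE sub.as_of_date
       END AS display_date
FROM (SELECT DISTINCT
             d.as_of_sid,
             -- SQLite has no to_date, so build an ISO date string instead.
             substr(d.as_of_sid, 1, 4) || '-' ||
             substr(d.as_of_sid, 5, 2) || '-' ||
             substr(d.as_of_sid, 7, 2) AS as_of_date
      FROM hldngs_date d) sub
ORDER BY sub.as_of_sid DESC  -- ordering applied in the outermost query
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The computed columns live in the select list, and the ORDER BY sits on the outer query, as the comments above recommend.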

Related

Sorting rows in cross query Access SQL

I have a crosstab query in Access with this SQL:
TRANSFORM Tab1.Income AS Income
SELECT Tab1.Month
FROM Tab1
GROUP BY Tab1.Month
PIVOT Tab1.Group;
Months are strings, not numbers, and come out in alphabetical order. I want to sort them manually.
In a normal query I used a Switch() function, which works perfectly. But in the crosstab query I get an alert that ORDER BY and GROUP BY are mutually exclusive.
I would be grateful for any idea on how to sort them in the query. However, if that's not possible, maybe they can be sorted in the report, because this is more important.
Options:
Calculate a month number field to use as the primary group criterion and include it along with the month name as Row Headers. If this is a multi-year dataset, perhaps also include the year as a Row Header. Hopefully there is a full date field available to extract date parts from.
Report design can use an expression to dictate sort order.
Month is a reserved word; I advise against using reserved words as object names.
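The first option can be sketched roughly as follows. This is not Access SQL: it uses SQLite through Python's sqlite3 module, emulates the crosstab with conditional aggregation instead of TRANSFORM/PIVOT, and the names MonthName, MonthNo, and Grp are hypothetical replacements chosen to avoid reserved words. The point is only that a calculated month number gives the rows a sortable key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# MonthName and Grp are hypothetical names that avoid the reserved words Month and Group.
conn.execute("CREATE TABLE Tab1 (MonthName TEXT, Grp TEXT, Income REAL)")
conn.executemany(
    "INSERT INTO Tab1 VALUES (?, ?, ?)",
    [("March", "A", 30), ("January", "A", 10), ("February", "B", 25),
     ("January", "B", 15), ("March", "B", 5)],
)

query = """
SELECT CASE MonthName
           WHEN 'January' THEN 1 WHEN 'February' THEN 2 WHEN 'March' THEN 3
       END AS MonthNo,                                   -- calculated sort key
       MonthName,
       SUM(CASE WHEN Grp = 'A' THEN Income END) AS A,    -- one column per group
       SUM(CASE WHEN Grp = 'B' THEN Income END) AS B
FROM Tab1
GROUP BY MonthName
ORDER BY MonthNo
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With the month number as the leading row header, the rows come back in calendar order rather than alphabetical order.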

What does SELECT Function in SQL actually produce? Does it produce a new table by default?

I am struggling to understand what the output of SELECT is meant to be in SQL (I am using MS ACCESS), and what sort of criteria this output needs to specify, if any. As a result, I don't understand why some queries work and others don't. So I know it retrieves data from a table, does calculations with it and displays it. But I don't understand the "inner" working of SELECT function. For instance, what is the name of data structure / entity it displays? Is it a "new" table?
And for example, suppose I have a table called "table_name", with 5 columns. One of the columns called "column_3", and there are 20 records.
SELECT column_3, COUNT(*) AS Count
FROM table_name;
Why does this query fail to run? By logic, I would expect it to display two columns: first column will be "column_3", containing 20 rows with relevant data, and second column will be "Count", containing just one non-empty row (displaying 20), and other 19 rows will be empty (or NULL maybe)?
Is it because SELECT is meant to produce equal number of rows for each column?
Your questions involve a basic understanding of SQL. SELECT statements do not create tables, but instead return virtual result sets. Nothing is persisted unless you change it to an INSERT.
In your example question, you will need to "tell" the SQL engine what you want a count "of". Because you added column_3, you need to write:
SELECT column_3, COUNT(*) AS Count
FROM table_name
GROUP BY column_3
If you wanted a count of all the rows, simply:
SELECT COUNT(*) FROM table_name
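To make the two forms concrete, here is a small sketch using SQLite through Python's sqlite3 module instead of Access, with made-up data for the 20 records. SQLite is more lenient than Access about the ungrouped query from the question, so only the grouped and total forms are shown. The sketch also checks that running SELECT leaves the catalog with the same single table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (column_3 TEXT)")
# Made-up data: 20 records with two distinct values in column_3.
conn.executemany(
    "INSERT INTO table_name VALUES (?)",
    [("x",)] * 12 + [("y",)] * 8,
)

# One output row per distinct column_3 value, each with its own count.
per_value = conn.execute(
    "SELECT column_3, COUNT(*) AS cnt FROM table_name "
    "GROUP BY column_3 ORDER BY column_3"
).fetchall()

# No GROUP BY: the whole table is one group, so a single row comes back.
total = conn.execute("SELECT COUNT(*) FROM table_name").fetchone()[0]

# The SELECTs returned result sets only; no new table was created.
tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()

print(per_value, total, tables)
```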

Optimal SQL query for querying 31 tables (containing datestamp in tablename)

Fairly new to SQL and I was stumped on this question I received in an interview recently.
The question was along the lines of how would you count the total occurrences of 'True' for Column B in July.
Problem was; there was no date or timestamp column in the table. Instead the table naming convention was defined as "ProductX_YYYYMMDD". The assumption being that a new table is created for each day's data dump.
Is there an efficient query I can write to obtain the True COUNTs of Column B for each table (which doesn't involve ~30 JOIN or UNION statements to get the answer)?
Use STRING_SPLIT(myColumn, '_') to split each table name on the underscore and isolate the YYYYMMDD part.
Then take the month out of it:
SELECT LEFT(RIGHT(tempColumn, 4), 2)
Now you have a temp table filled with only the month (MM), and you can use
SELECT COUNT(*) FROM dailyTable WHERE dailyName LIKE '07'
Add the count from every daily table to a variable.
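One way to read the idea above is to let the catalog supply the July table names and accumulate the counts in a loop. The sketch below does that with SQLite through Python's sqlite3 module. The year 2023, the sample data, and storing 'True' as text in a column named B are all assumptions for illustration. A real system would use its own catalog view and dynamic SQL facilities.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical daily dumps: two July tables and one August table.
sample = {
    "ProductX_20230701": ["True", "False", "True"],
    "ProductX_20230702": ["True"],
    "ProductX_20230801": ["True", "True"],
}
for name, values in sample.items():
    conn.execute(f'CREATE TABLE "{name}" (B TEXT)')
    conn.executemany(f'INSERT INTO "{name}" VALUES (?)', [(v,) for v in values])

# Find the July tables from the catalog instead of listing them by hand.
july_tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name GLOB 'ProductX_202307[0-9][0-9]' "
        "ORDER BY name"
    )
]

# Add the count from every daily table to a running total.
total_true = 0
for name in july_tables:
    # Table names come from the catalog, not user input, so quoting them is enough here.
    (count,) = conn.execute(
        f'SELECT COUNT(*) FROM "{name}" WHERE B = ?', ("True",)
    ).fetchone()
    total_true += count

print(july_tables, total_true)
```

This still issues one query per table, but nobody has to write out 31 JOIN or UNION branches by hand.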

Get latest data for all people in a table and then filter based on some criteria

I am attempting to return the row of the highest value for timestamp (an integer) for each person (that has multiple entries) in a table. Additionally, I am only interested in rows with the field containing ABCD, but this should be done after filtering to return the latest (max timestamp) entry for each person.
SELECT table."person", max(table."timestamp")
FROM table
WHERE table."type" = 1
HAVING table."field" LIKE '%ABCD%'
GROUP BY table."person"
For some reason, I am not receiving the data I expect. The returned table is nearly twice the size of expectation. Is there some step here that I am not getting correct?
You can first build a derived table with the max(timestamp) per person and then join it back to the table in an outer select, which applies the filters afterwards. The query follows:
SELECT t."person", t."timestamp"
FROM table t
JOIN (SELECT "person", MAX("timestamp") AS max_ts FROM table GROUP BY "person") latest
ON latest."person" = t."person" AND latest.max_ts = t."timestamp"
WHERE t."type" = 1 AND t."field" LIKE '%ABCD%'
Direct answer: as I understand your end goal, just move the HAVING clause to the WHERE section:
SELECT
table."person", MAX(table."timestamp")
FROM table
WHERE
table."type" = 1
AND table."field" LIKE '%ABCD%'
GROUP BY table."person";
This should return no more than 1 row per table."person", with their associated maximum timestamp.
As an aside, I'm surprised your query worked at all. Your HAVING clause referenced a column not in your query. From the documentation (and my experience):
The fundamental difference between WHERE and HAVING is this: WHERE selects input rows before groups and aggregates are computed (thus, it controls which rows go into the aggregate computation), whereas HAVING selects group rows after groups and aggregates are computed.
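Because WHERE runs before the aggregate, filtering on field in WHERE and filtering after finding each person's latest row can give different answers. The sketch below shows both on made-up data, using SQLite through Python's sqlite3 module. The names events and ts are hypothetical stand-ins for the question's table and timestamp.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "events" and "ts" are hypothetical names standing in for table and timestamp.
conn.execute(
    "CREATE TABLE events (person TEXT, ts INTEGER, type INTEGER, field TEXT)"
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?)",
    [("ann", 1, 1, "xxABCDxx"), ("ann", 2, 1, "other"),
     ("bob", 1, 1, "other"), ("bob", 3, 1, "ABCD")],
)

# Filter in WHERE: only ABCD rows reach MAX(), so ann's older row wins.
filter_first = conn.execute(
    "SELECT person, MAX(ts) FROM events "
    "WHERE type = 1 AND field LIKE '%ABCD%' "
    "GROUP BY person ORDER BY person"
).fetchall()

# Find each person's latest row first, then keep it only if it contains ABCD.
latest_then_filter = conn.execute(
    "SELECT e.person, e.ts FROM events e "
    "JOIN (SELECT person, MAX(ts) AS max_ts FROM events "
    "      WHERE type = 1 GROUP BY person) m "
    "  ON m.person = e.person AND m.max_ts = e.ts "
    "WHERE e.type = 1 AND e.field LIKE '%ABCD%' "
    "ORDER BY e.person"
).fetchall()

print(filter_first)
print(latest_then_filter)
```

Which of the two is right depends on whether the ABCD condition is meant to apply to every row or only to each person's latest row.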

In SQL, why does group by make a difference when using having count()

I have a table that stores zone_id. Sometimes a zone_id appears twice in the database. I wrote a query to show only the zone_id values that have two or more entries in the table.
The following query returns the correct result:
select *, count(zone_id)
from proxies.storage_used
group by zone_id desc
having count(zone_id) > 1;
However, if I group by last_updated or company_id, it returns random values. If I don't add a group by clause, it only displays one value as per the screenshot below. First output shows above query string, second output shows same query string without the 'group by' line and returns only one value:
correction: I'm a new member and thus can't post pictures directly, so I added it on minus: http://min.us/m3yrlkSMu#1o
While my query works, I don't understand why. Can somebody help me understand why group by is altering the actual output, instead of only the grouping of the output? I am using MySQL.
A group by divides the resulting rows into groups and performs the aggregate function on the records in each group. If you do a count(*) without a group by, you will get a single count of all rows in a table. Since you didn't specify a group by, there is only one group: all records in the table. If you do a count(*) with a group by of zone id, you will get a count of how many records there are for each zone id. If you do a count(*) with a group by of zone id and last updated date, you will get a count of how many rows were updated on each date in each zone.
Without a group by clause, everything is stored in the same group, so you get a single result. If there is more than one row in your table, then the having will succeed. So, you'll end up counting all the rows in your table...
source
From what I got, you could create a query with having and without group by only in two situations:
You have a where clause, and you want to test a condition on an aggregation of all rows that satisfy that clause.
Same as above, but for all rows in your table (in practice, it doesn't make sense, though).
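A small sketch of how the grouping column changes what gets counted, using SQLite through Python's sqlite3 module rather than MySQL and made-up rows for storage_used. It selects only the grouped column and the aggregate, since the other columns in a select * have no single value per group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE storage_used (zone_id INTEGER, company_id INTEGER)")
# Made-up rows: zones 1 and 3 appear twice, zone 2 once.
conn.executemany(
    "INSERT INTO storage_used VALUES (?, ?)",
    [(1, 10), (1, 11), (2, 10), (3, 12), (3, 12)],
)

# Groups are zones: the count is rows per zone_id.
dup_zones = conn.execute(
    "SELECT zone_id, COUNT(zone_id) FROM storage_used "
    "GROUP BY zone_id HAVING COUNT(zone_id) > 1 ORDER BY zone_id"
).fetchall()

# Groups are companies: the same HAVING now tests rows per company_id.
by_company = conn.execute(
    "SELECT company_id, COUNT(zone_id) FROM storage_used "
    "GROUP BY company_id HAVING COUNT(zone_id) > 1 ORDER BY company_id"
).fetchall()

# No GROUP BY: the whole table is a single group and a single row comes back.
whole_table = conn.execute("SELECT COUNT(zone_id) FROM storage_used").fetchall()

print(dup_zones, by_company, whole_table)
```

The grouping column decides what each count measures, which is why switching it changes the output rows and not just their arrangement.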