Simple WHERE clause but keep the filtered-out rows and fill them with null values - sql

I have a table which basically looks like this one:
Date       | Criteria
12-04-2016 | 123
12-05-2016 | 1234
...
Now I want to select the rows whose value in the column 'Criteria' falls within a given range, but I also want to keep the rows that fall outside that range. Those rows should get the value 'null' in the 'Criteria' column. So for example, if I want to select the row with 'Criteria = 123', my result should look like this:
Date       | Criteria
12-04-2016 | 123
12-05-2016 | null
Currently I am using this query to get the result:
SELECT b.date, a.criteria
FROM (SELECT id, date, criteria FROM ABC WHERE criteria > 100 and criteria < 200) a
FULL OUTER JOIN ABC b ON a.id = b.id ORDER BY a.criteria
Someone told me that full outer joins perform very badly. Plus, my table has around 400,000 records and the query is used pretty often. Does anyone have an idea how to speed up my query? Btw, I am using an Oracle 11g database.

Do you just want a case expression?
SELECT date,
(case when criteria > 100 and criteria < 200 then criteria end) as criteria
FROM ABC;
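If you also want the ordering from your original query (matching rows first by criteria, then the nulled-out ones), the same expression can go in the ORDER BY; a sketch, untested against Oracle 11g:
SELECT date,
       (case when criteria > 100 and criteria < 200 then criteria end) as criteria
FROM ABC
ORDER BY (case when criteria > 100 and criteria < 200 then criteria end);
In Oracle, ascending order puts NULLs last by default, so the filtered-out rows end up at the bottom, just as with your ORDER BY a.criteria.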

The nearest row in the other table

One table is a sample of users and their purchases.
Structure:
Email | NAME | TRAN_DATETIME (Varchar)
So we have the customer email + first name & last name + date of transaction,
and the second table, which comes from a second system, contains all users, their sensitive data, and when they got registered in our system.
Simplified Structure:
Email | InstertDate (varchar)
My task is to compute the difference in minutes between the rows inserted from sales (first table) and the rows with users and their sensitive data.
The issue is that the second table contains many rows per user, and I want to find the row inserted in the 2nd table that is nearest in time, because sometimes the difference is a few minutes (a delay in either direction) and sometimes it can be a few days.
So for email x I have this row in the 1st table:
E_MAIL NAME TRAN_DATETIME
p****#****.eu xxx xxx 2021-10-04 00:03:09.0000000
But then I have 3 rows, and the latest one is the one I want to use to compute the difference:
Email InstertDate
p****#****.eu 2021-05-20 19:12:07
p****#****.eu 2021-05-20 19:18:48
p****#****.eu 2021-10-03 18:32:30 <--
I wrote this query, but I have no idea how to match the nearest row in the 2nd table:
SELECT DISTINCT TOP (100)
     a.[E_MAIL]
    ,a.[NAME]
    ,a.[TRAN_DATETIME]
    ,CASE WHEN b.EMAIL IS NOT NULL THEN 'YES' ELSE 'NO' END AS 'EXISTS'
    ,ABS(CONVERT(INT, CONVERT(Datetime, LEFT(a.[TRAN_DATETIME],10), 120)) - CONVERT(INT, CONVERT(Datetime, LEFT(b.[INSERTDATE],10), 120))) as 'DateAccuracy'
FROM [crm].[SalesSampleTable] a
left join [crm].[SensitiveTable] b on a.[E_MAIL] = b.[EMAIL]
Totally untested: I'd need sample data and to know the database. The area of suspicion is the casting of dates and the date math... since I don't know what RDBMS and version this is, consider the following "pseudo code".
We assign a row number ordered by the absolute difference in seconds between the dates; the rows with a row number of 1 (the smallest difference, i.e. the nearest) win.
WITH CTE AS (
    SELECT A.*, B.*,
           row_number() over (PARTITION BY A.[E_MAIL]
                              ORDER BY abs(datediff(second, cast(A.[TRAN_DATETIME] as datetime), cast(B.[InstertDate] as datetime)))) as RN
    FROM [crm].[SalesSampleTable] A
    LEFT JOIN [crm].[SensitiveTable] B
        on A.[E_MAIL] = B.[EMAIL]
)
SELECT * FROM CTE WHERE RN = 1
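To then get the minutes difference the question asks for, the winning row can feed a plain DATEDIFF. A sketch along the same lines, equally untested, assuming SQL Server and the column names shown above (datetime2 because the sample TRAN_DATETIME carries seven fractional digits):
WITH CTE AS (
    SELECT a.[E_MAIL], a.[NAME], a.[TRAN_DATETIME], b.[InstertDate],
           row_number() over (PARTITION BY a.[E_MAIL]
                              ORDER BY abs(datediff(second, cast(a.[TRAN_DATETIME] as datetime2), cast(b.[InstertDate] as datetime2)))) as RN
    FROM [crm].[SalesSampleTable] a
    LEFT JOIN [crm].[SensitiveTable] b ON a.[E_MAIL] = b.[EMAIL]
)
SELECT [E_MAIL], [NAME], [TRAN_DATETIME], [InstertDate],
       -- minutes from the nearest registration row to the sale
       datediff(minute, cast([InstertDate] as datetime2), cast([TRAN_DATETIME] as datetime2)) AS minutes_diff
FROM CTE
WHERE RN = 1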

SQL simplifying an except query

I have a database with around 50 million entries showing the status of a device for a given day, simplified to the form:
id | status
-------------
1 | Off
1 | Off
1 | On
2 | Off
2 | Off
3 | Off
3 | Off
3 | On
...
such that each id is guaranteed to have at least 2 rows with an 'Off' status, but doesn't have to have an 'On' status. I'm trying to get a list of only the ids that do not have an 'On' status. For example, in the above data set I'd want a query returning only '2'.
The current query is:
SELECT DISTINCT id FROM table
EXCEPT
SELECT DISTINCT id FROM table WHERE status <> 'Off'
This seems to work, but it has to iterate over the entire table twice, which ends up taking ~10-12 minutes per run. Is there a simpler way to do this with only a single query?
You can use WHERE NOT EXISTS instead:
Select Distinct Id
From Table A
Where Not Exists
(
Select *
From Table B
Where A.Id = B.Id
And B.Status = 'On'
)
I would also recommend looking at the indexes on the Status column. 10-12 minutes to run is excessively long. Even with 50m records, with proper indexing, a query like this shouldn't take longer than a second.
To add an index to the column, you can run this (I'm assuming SQL Server, your syntax may vary):
Create NonClustered Index Ix_YourTable_Status On YourTable (Status Asc);
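Depending on how the optimizer runs the correlated NOT EXISTS lookup, an index that covers both Id and Status may serve it even better; a sketch, again assuming SQL Server syntax, and worth verifying against your actual execution plan:
Create NonClustered Index Ix_YourTable_Id_Status On YourTable (Id Asc, Status Asc);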
You can use conditional aggregation.
select id
from table
group by id
having count(case when status='On' then 1 end)=0
You can use the help of a SELF JOIN (a LEFT JOIN anti-join):
SELECT DISTINCT A.Id
FROM Table A
LEFT JOIN Table B
    ON A.Id = B.Id
    AND B.Status = 'On'   -- the filter on B belongs in the ON clause
WHERE B.Id IS NULL        -- keep only ids with no matching 'On' row

Adding in missing dates from results in SQL

I have a database that currently looks like this
Date     | valid_entry | profile
1/6/2015 | 1           | 1
3/6/2015 | 2           | 1
3/6/2015 | 2           | 2
5/6/2015 | 4           | 4
I am trying to grab the dates, but I need the query to also display dates that do not exist in the list, such as 2/6/2015.
This is a sample of what I need it to be:
Date     | valid_entry
1/6/2015 | 1
2/6/2015 | 0
3/6/2015 | 2
3/6/2015 | 2
4/6/2015 | 0
5/6/2015 | 4
My query:
select date, count(valid_entry)
from database
where profile = 1
group by 1;
This query will only display the dates that exist in the table. Is there a way I can populate the results with the dates that do not exist there?
You can generate a list of all dates that are between the start and end date from your source table using generate_series(). These dates can then be used in an outer join to sum the values for all dates.
with all_dates (date) as (
select dt::date
from generate_series( (select min(date) from some_table), (select max(date) from some_table), interval '1' day) as x(dt)
)
select ad.date, sum(coalesce(st.valid_entry,0))
from all_dates ad
left join some_table st on ad.date = st.date
group by ad.date, st.profile
order by ad.date;
some_table is your table with the sample data you have provided.
Based on your sample output, you also seem to want to group by date and profile, otherwise there couldn't be two rows with 2015-06-03. You also don't seem to want where profile = 1, because that as well wouldn't produce two rows with 2015-06-03 as shown in your sample output.
SQLFiddle example: http://sqlfiddle.com/#!15/b0b2a/2
Unrelated, but: I hope that the column names are only made up. date is a horrible name for a column. For one because it is also a keyword, but more importantly it does not document what this date is for. A start date? An end date? A due date? A modification date?
You have to use a calendar table for this purpose. In this case you can create an in-line table with the dates required, then LEFT JOIN your table to it:
select "date", count(valid_entry)
from (
SELECT '2015-06-01' AS d UNION ALL '2015-06-02' UNION ALL '2015-06-03' UNION ALL
'2015-06-04' UNION ALL '2015-06-05' UNION ALL '2015-06-06') AS t
left join database AS db on t.d = db."date" and db.profile = 1
group by t.d;
Note: The predicate profile = 1 should be applied in the ON clause of the LEFT JOIN operation. If it is placed in the WHERE clause instead, the LEFT JOIN essentially becomes an INNER JOIN.
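If listing every date by hand is impractical, a recursive CTE can build the same in-line calendar; a sketch, assuming PostgreSQL-style syntax (WITH RECURSIVE) and the same placeholder names as above:
with recursive calendar (d) as (
    select date '2015-06-01'
    union all
    select cast(d + interval '1' day as date)
    from calendar
    where d < date '2015-06-06'
)
select c.d, count(db.valid_entry)
from calendar c
left join database db on c.d = db."date" and db.profile = 1
group by c.d
order by c.d;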

Update with results of another sql

With the SQL below I count how many records I have in tableB for each code. The total field receives the result of the count and the code field the code of the record.
SELECT
"count" (*) as total,
tableB."code" as code
FROM
tableB
WHERE
tableB.code LIKE '%1'
GROUP BY
tableB.code
In tableA I have a sequence field, and I want to update it with the result of total (obtained in the previous SQL) plus 1, doing this for each code.
I tried this and it did not work, can someone help me?
UPDATE tableA
SET tableA.sequence = (tableB.total + 1) where tableA."code" = tableB.code
FROM
(
SELECT
"count" (*) as total,
tableB."code" as code
FROM
tableB
WHERE
tableB.code LIKE '%1'
GROUP BY
tableB.code
)
I edited to show my tables, as I believe it makes it easier to understand my need:
tableA
code sequence
100 null
200 null
tableB
code sequence
100 1
100 2
100 3
100 4
......
100 17
200 1
200 2
200 3
200 4
......
200 23
I need to update the blank sequence field in tableA with the number 18 for code = 100.
I need to update the blank sequence field in tableA with the number 24 for code = 200.
This assumes that code is unique in table_a:
with max_seq as (
select code,
max(sequence) + 1 as max_seq
from table_b
group by code
)
update table_a
set sequence = ms.max_seq
from max_seq ms
where table_a.code = ms.code;
SQLFiddle example: http://sqlfiddle.com/#!15/745a7/1
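With the sample rows above, this sets sequence to 18 for code = 100 and 24 for code = 200; a quick check (untested sketch):
select code, sequence from table_a order by code;
-- code | sequence
-- 100  | 18
-- 200  | 24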
UPDATE tbl_a a
SET sequence = b.next_seq
FROM (
SELECT code, max(sequence) + 1 AS next_seq
FROM tbl_b
GROUP BY code
) b
WHERE a.code = b.code;
SQL Fiddle.
Only columns of the target table can be updated, and it would not make sense to table-qualify those; consequently, table-qualifying the column in the SET clause is not allowed.
Every subquery must have a table alias for the derived table.
I would not use a CTE for a simple UPDATE like this. A subquery in the FROM clause is typically simpler and faster.
No point in double-quoting the aggregate function count(). No point in double-quoting perfectly legal, lower-case identifiers, either. No point in table-qualifying columns in a subquery on a single table in a plain SQL command (no harm either).
You don't need a WHERE condition, since you want to UPDATE all rows (as it seems). Note that only rows with a matching code are updated; other rows in tbl_a remain untouched.
Basically you need to read the manual about UPDATE before you try any of this.

Grouping by intervals

Given a table (mytable) containing a numeric field (mynum), how would one go about writing an SQL query which summarizes the table's data based on ranges of values in that field rather than each distinct value?
For the sake of a more concrete example, let's make it intervals of 3 and just "summarize" with a count(*), such that the results tell the number of rows where mynum is 0-2.99, the number of rows where it's 3-5.99, where it's 6-8.99, etc.
The idea is to compute some function of the field that has a constant value within each group you want. For intervals of three starting at zero, floor(mynum/3) does that:
select count(*), floor(mynum / 3) as foo from mytable group by foo;
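A variant of the same idea, if you want each range labelled by its lower bound instead of a bare bucket number (a sketch, assuming a floor() function is available in your database):
select floor(mynum / 3) * 3 as range_start, count(*)
from mytable
group by floor(mynum / 3) * 3
order by range_start;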
I do not know if this is applicable to MySQL, but in SQL Server I think you can "simply" repeat the same CASE expression in both the select list and the group by list.
Something like:
select
    CASE
        WHEN id <= 20 THEN 'lessthan20'
        WHEN id > 20 and id <= 30 THEN '20and30'
        ELSE 'morethan30'
    END,
    count(*)
from Profiles
where 1=1
group by
    CASE
        WHEN id <= 20 THEN 'lessthan20'
        WHEN id > 20 and id <= 30 THEN '20and30'
        ELSE 'morethan30'
    END
returns something like
column1 column2
---------- ----------
20and30 3
lessthan20 3
morethan30 13