SQL query, select from 2 tables random

Hello all, I have a problem that I just can't get to work the way I want.
I want to show news and reviews (2 tables), and I want the output to be random rather than the same rows every time.
Here is my query; I really hope someone can explain what I am doing wrong:
SELECT
anmeldelser.billed_sti ,
anmeldelser.overskrift ,
anmeldelser.indhold ,
anmeldelser.id ,
anmeldelser.godkendt
FROM
anmeldelser
LIMIT 0,6
UNION ALL
SELECT
nyheder.id ,
nyheder.billed_sti ,
nyheder.overskrift ,
nyheder.indhold ,
nyheder.godkendt
FROM nyheder
ORDER BY rand() LIMIT 0,6

First off, the column order of the two SELECT statements doesn't match, and it needs to for a UNION.
What does the following return?
SELECT
anmeldelser.billed_sti ,
anmeldelser.overskrift ,
anmeldelser.indhold ,
anmeldelser.id ,
anmeldelser.godkendt
FROM
anmeldelser
LIMIT 0,6
UNION ALL
SELECT
nyheder.billed_sti ,
nyheder.overskrift ,
nyheder.indhold ,
nyheder.id ,
nyheder.godkendt
FROM nyheder
ORDER BY rand() LIMIT 0,6
(Which RDBMS are you using? The SQL you have is not valid for Sybase, but there may be techniques depending on the flavour of SQL you are using.)

Since RAND() appears only in the ORDER BY clause, would it not only be evaluated once for the whole query, and not once per row?

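The problem is that the first SELECT is not picking random rows: it has a LIMIT but no ORDER BY rand(). Combine both tables in a subquery first, then shuffle and limit the combined result: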
SELECT temp.* FROM
(
SELECT
anmeldelser.id ,
anmeldelser.billed_sti ,
anmeldelser.overskrift ,
anmeldelser.indhold ,
anmeldelser.godkendt,
'Review' as artType
FROM anmeldelser
UNION
SELECT
nyheder.id ,
nyheder.billed_sti ,
nyheder.overskrift ,
nyheder.indhold ,
nyheder.godkendt,
'News' as artType
FROM nyheder
) temp
ORDER BY rand() LIMIT 0,6
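The combine-then-shuffle idea can be tried end to end. This is a minimal sketch, assuming SQLite through Python's sqlite3 module rather than the asker's database, so RANDOM() and LIMIT 6 stand in for rand() and LIMIT 0,6. The sample rows and the trimmed column list are made up for illustration; anmeldelser is Danish for reviews and nyheder for news, so the type labels follow that.

```python
import sqlite3

def random_articles(conn, n=6):
    """Return n rows drawn at random from reviews and news combined."""
    sql = """
        SELECT temp.* FROM (
            SELECT id, overskrift, 'Review' AS artType FROM anmeldelser
            UNION ALL
            SELECT id, overskrift, 'News' AS artType FROM nyheder
        ) AS temp
        ORDER BY RANDOM()   -- SQLite spelling; the thread's query uses rand()
        LIMIT ?
    """
    return conn.execute(sql, (n,)).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE anmeldelser (id INTEGER PRIMARY KEY, overskrift TEXT);
    CREATE TABLE nyheder     (id INTEGER PRIMARY KEY, overskrift TEXT);
""")
# Hypothetical sample data: ten reviews and ten news items.
conn.executemany("INSERT INTO anmeldelser VALUES (?, ?)",
                 [(i, f"review {i}") for i in range(1, 11)])
conn.executemany("INSERT INTO nyheder VALUES (?, ?)",
                 [(i, f"news {i}") for i in range(1, 11)])

rows = random_articles(conn)
for row in rows:
    print(row)
```

Because the shuffle runs over the combined set, the mix of reviews and news varies from run to run; if a fixed number from each table is wanted, each table would need its own shuffled, limited subquery instead.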

Related

Sql query with group by takes too long

I have a very simple query, but it takes too long to run when I use MAX and GROUP BY. Could you please propose an alternative? I use Oracle 18c for running this query. (a_num_ver, id, site_id) is the primary key.
SELECT id
, site_id
, sub_id
, max(a_num_ver) as a_num_ver
, ae_no
, max(aer_ver) AS aer_ver
FROM table_1
GROUP BY id
, site_id
, sub_id
, ae_no

Try a parallel hint with a degree of 4 or 8, if your DBA allows it. I tried a similar query on a table with around 296,292,720 rows. Without hints, it took around 2 minutes to execute. With PARALLEL(8), it came down to 20 seconds.
SELECT /*+ PARALLEL(8) */
id
, site_id
, sub_id
, max(a_num_ver) as a_num_ver
, ae_no
, max(aer_ver) AS aer_ver
FROM table_1
GROUP BY id
, site_id
, sub_id
, ae_no

Row_number partition by performance

How can I improve performance when ROW_NUMBER() with PARTITION BY is used in a Hive query?
select *
from
(
SELECT
'123' AS run_session_id
, tbl1.transaction_id
, tbl1.src_transaction_id
, tbl1.transaction_created_epoch_time
, tbl1.currency
, tbl1.event_type
, tbl1.event_sub_type
, tbl1.estimated_total_cost
, tbl1.actual_total_cost
, tbl1.tfc_export_created_epoch_time
, tbl1.authorizer
, tbl1.acquirer
, tbl1.processor
, tbl1.company_code
, tbl1.country_of_account
, tbl1.merchant_id
, tbl1.client_id
, tbl1.ft_id
, tbl1.transaction_created_date
, tbl1.event_pst_time
, tbl1.extract_id_seq
, tbl1.src_type
, ROW_NUMBER() OVER(PARTITION by tbl1.transaction_id ORDER BY tbl1.event_pst_time DESC) AS seq_num -- while writing back to the pfit events table, write each event so that event_pst_time populates in right way
FROM nest.nest_cost_events tbl1 --<hiveFinalDB>-- -- DB variables won't work, so the DB needs to be changed accordingly for testing and PROD deployment
WHERE extract_id_seq BETWEEN 275 - 60
AND 275
AND event_type in('ACT','CBR','SKU','CAL','KIT','BXT' )) tbl1
where seq_num=1;
This table is partitioned by src_type.
It currently takes 20 minutes to process 154M records. I want to reduce that to 10 minutes.
Any suggestions?
Thanks

How to make testing an SQL Query faster?

I was given this code for a SQL query, but for the life of me can't figure out what exactly is going on. I'm pretty new to SQL so any help is greatly appreciated.
SELECT *
FROM ( SELECT rownum as rn
, a.*
FROM ( SELECT outbound.MSG_ID
, outbound.MSG_TYPE
, outbound.FROM_ADDR
, outbound.TO_ADDR
, outbound.EMAIL_SUBJECT
, outbound.CREATION_DATE
, outbound.MQ_MSG_ID
FROM MESSAGES outbound
WHERE (1 = 1)
GROUP BY outbound.MSG_ID
, outbound.MSG_TYPE
, outbound.FROM_ADDR
, outbound.TO_ADDR
, outbound.EMAIL_SUBJECT
, outbound.CREATION_DATE
, outbound.MQ_MSG_ID
ORDER BY CREATION_DATE DESC ) a
)
WHERE rn BETWEEN 1 AND 25
I'm specifically having trouble understanding SELECT rownum as rn, a.* FROM (...a ), but I assume this is where I would edit the query to only check 1000 rows (which is my goal). Right now it's checking all entries in the database (750,000) and I only want it to check 1000 for testing.
Thanks!

Alright, let's start picking this guy apart. Starting with the subquery
SELECT outbound.MSG_ID
, outbound.MSG_TYPE
, outbound.FROM_ADDR
, outbound.TO_ADDR
, outbound.EMAIL_SUBJECT
, outbound.CREATION_DATE
, outbound.MQ_MSG_ID
FROM MESSAGES outbound
WHERE (1 = 1)
GROUP BY outbound.MSG_ID
, outbound.MSG_TYPE
, outbound.FROM_ADDR
, outbound.TO_ADDR
, outbound.EMAIL_SUBJECT
, outbound.CREATION_DATE
, outbound.MQ_MSG_ID
ORDER BY CREATION_DATE DESC
What's happening there is the subquery is selecting msg_id, msg_type, etc from the table MESSAGES. It's aliasing that table and calling it outbound. FROM MESSAGES outbound means "get the data from MESSAGES but call the table outbound instead."
Now, you might note the WHERE (1=1) clause... that's trivially true, and will always occur. Sometimes people use WHERE (1=1) because a script somewhere adds additional filters if certain parameters are selected. For now don't worry about that.
Next, the GROUP BY {blah blah blah} is telling your database to dedup these data. Because every selected column is listed, it's effectively SELECT DISTINCT. Last, the subquery is ordered by CREATION_DATE DESC, so the most recent messages come first. If I had to guess, the deduping and ordering is because this is a messaging system that might contain essentially duplicate records (like maybe someone resent the same email) or because messaging systems are often distributed and don't emphasize consistency on write, but rather write speed. I have no idea why exactly they needed to dedup these guys, but the important thing for you is that someone thought it was necessary and they were probably right.
Outside of the subquery you see
SELECT rownum as rn
, a.*
Everything that the subquery was doing got labelled "a". Remember that alias concept from earlier. Your entire subquery has an alias too, and it's called "a". So, we are selecting everything from a ("a.*") and we are also selecting the rownumber and calling that rn. The where clause at the very end says "give me the first 25 rows."
So... if you want to select 1000 rows in this manner (dedup, keep the most recent, etc) then just change WHERE rn BETWEEN 1 AND 25 to WHERE rn BETWEEN 1 AND 1000.
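To see the dedup-then-page behaviour on a small scale, here is a minimal sketch. It assumes SQLite through Python's sqlite3 module instead of Oracle, so there is no ROWNUM: LIMIT plays the role of the rn BETWEEN filter. The table is cut down to three columns and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MESSAGES (MSG_ID INTEGER, TO_ADDR TEXT, CREATION_DATE TEXT)"
)
# Hypothetical rows; message 2 was stored twice with identical values.
conn.executemany(
    "INSERT INTO MESSAGES VALUES (?, ?, ?)",
    [
        (1, "a@example.com", "2020-01-01"),
        (2, "b@example.com", "2020-01-02"),
        (2, "b@example.com", "2020-01-02"),
        (3, "c@example.com", "2020-01-03"),
    ],
)

# Grouping by every selected column collapses exact duplicates...
grouped = conn.execute("""
    SELECT MSG_ID, TO_ADDR, CREATION_DATE
    FROM MESSAGES
    GROUP BY MSG_ID, TO_ADDR, CREATION_DATE
    ORDER BY CREATION_DATE DESC
""").fetchall()

# ...which gives the same rows as SELECT DISTINCT.
distinct = conn.execute("""
    SELECT DISTINCT MSG_ID, TO_ADDR, CREATION_DATE
    FROM MESSAGES
    ORDER BY CREATION_DATE DESC
""").fetchall()

# LIMIT 2 stands in for "WHERE rn BETWEEN 1 AND 2" in the Oracle query.
first_page = conn.execute("""
    SELECT DISTINCT MSG_ID, TO_ADDR, CREATION_DATE
    FROM MESSAGES
    ORDER BY CREATION_DATE DESC
    LIMIT 2
""").fetchall()

print(grouped)
print(first_page)
```

Note that the limit is applied after deduplication and sorting, so the database still has to read every row first; that is why raising 25 to 1000 changes how many rows come back, not how many are scanned.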
If, on the other hand, you don't want to dedup messages at all and only want the top 1000 rows of the table, then
SELECT outbound.MSG_ID
, outbound.MSG_TYPE
, outbound.FROM_ADDR
, outbound.TO_ADDR
, outbound.EMAIL_SUBJECT
, outbound.CREATION_DATE
, outbound.MQ_MSG_ID
FROM MESSAGES outbound
WHERE ROWNUM <= 1000;
should do the job.
Does this help?

To answer your question, you need to decide how early you want to limit the subset of records your query checks during testing.
You also have to define what your goal is with testing: are you looking to do a simple check that the query can be executed, or are you actually looking to prove correctness?
If you just want to test that it executes, you could put a limit very early on, something like this (TOP is SQL Server syntax; the ROWNUM in your query suggests Oracle, where the equivalent is a WHERE ROWNUM <= 1000 filter):
-- first part of query omitted for brevity
SELECT TOP 1000 outbound.MSG_ID
, outbound.MSG_TYPE
, outbound.FROM_ADDR
, outbound.TO_ADDR
, outbound.EMAIL_SUBJECT
, outbound.CREATION_DATE
, outbound.MQ_MSG_ID
FROM MESSAGES outbound
-- bottom part of query omitted for brevity
Or, for fastest performance, limit the initial source:
-- first part of query omitted for brevity
SELECT outbound.MSG_ID
, outbound.MSG_TYPE
, outbound.FROM_ADDR
, outbound.TO_ADDR
, outbound.EMAIL_SUBJECT
, outbound.CREATION_DATE
, outbound.MQ_MSG_ID
FROM (SELECT TOP 1000 * FROM MESSAGES) outbound
-- bottom part of query omitted for brevity

SQL Query: How to Join Two SQL Queries in Oracle Report

I have been tasked with converting an old reports program to Oracle Reports, and I came to a halt when I needed to join two queries to make the report work. I'm not new to SQL, but I do need help on this one.
In Oracle Reports 11g, the report needs to show the results of the following two queries, so they need to be combined into one single SQL query for the report to work.
First query:
select table_name
, to_char(load_date, 'MM/DD/ YYYY') as XDATE
, to_char(number_name) as NUMBER_NAME
, round(sysdate-load_date) as DAYS
, 'E' AS TABLEIND
from error_table
where load_date is not null
and round(sysdate-load_date) > 15
and number_name not in
(select number_name
from table_comments)
order by table_name
Second query:
select table_name
, to_char(load_date, 'MM/DD/ YYYY') as XDATE
, to_char(number_name) as NUMBER_NAME
, round(sysdate-load_date) as DAYS
, 'O' AS TABLEIND
from other_table
where load_date is not null
and round(sysdate-load_date) > 15
and number_name not in
(select number_name
from table_comments)
order by table_name
The report should show the results of the first query first, followed by the results of the second query. Any help with this problem is highly appreciated.

( Query1
--leave out the "order by" line
)
UNION ALL
( Query2
--leave out the "order by" line, too
)
ORDER BY TABLEIND
, table_name

If you're trying to get these to come out in one result set, try a UNION between them. You can order the whole result set by TABLEIND, table_name to sort the way you want, I believe.

You can create a union query with the existing queries as inline views:
select 1 as whichQuery, q1.col, q1.col, ...
from
(select....) q1
union all
select 2 as whichQuery, q2.col, q2.col, ...
from
(select ....) q2
and then you can order by whichQuery. That guarantees the order you want in case TABLEIND alpha sort value should vary (and not sort in the order you want).
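A small sketch of the whichQuery idea, assuming SQLite through Python's sqlite3 module rather than Oracle Reports. The two tables are reduced to a single table_name column and the rows are invented; only the ordering technique is the point.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE error_table (table_name TEXT);
    CREATE TABLE other_table (table_name TEXT);
    INSERT INTO error_table VALUES ('zeta'), ('alpha');
    INSERT INTO other_table VALUES ('mid'), ('beta');
""")

rows = conn.execute("""
    SELECT 1 AS whichQuery, q1.table_name AS table_name, 'E' AS TABLEIND
    FROM (SELECT table_name FROM error_table) q1
    UNION ALL
    SELECT 2 AS whichQuery, q2.table_name AS table_name, 'O' AS TABLEIND
    FROM (SELECT table_name FROM other_table) q2
    ORDER BY whichQuery, table_name   -- applies to the whole combined result
""").fetchall()

for row in rows:
    print(row)
# All rows from the first query come first, each block sorted by table_name.
```

Sorting on an explicit numeric marker keeps the block order independent of whatever letters TABLEIND happens to hold.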

If you HAVE to have it in this format, dump the results of the first query into a temp table with an identity column, then dump the results of the second query into the same table.
Then select from that temp table, sorted by that identity column.

SQL query ...multiple max value selection. Help needed

Business World 1256987 monthly 10 2009-10-28
Business World 1256987 monthly 10 2009-09-23
Business World 1256987 monthly 10 2009-08-18
Linux 4 U 456734 monthly 25 2009-12-24
Linux 4 U 456734 monthly 25 2009-11-11
Linux 4 U 456734 monthly 25 2009-10-28
I get this result with the query:
SELECT DISTINCT ljm.journelname, ljm.subscription_id,
ljm.frequency, ljm.publisher, ljm.price, ljd.receipt_date
FROM lib_journals_master ljm,
lib_subscriptionhistory lsh,
lib_journal_details ljd
WHERE ljd.journal_id=ljm.id
ORDER BY ljm.publisher
What I need is the latest date for each journal.
I tried this query:
SELECT DISTINCT ljm.journelname, ljm.subscription_id,
ljm.frequency, ljm.publisher, ljm.price,ljd.receipt_date
FROM lib_journals_master ljm,
lib_subscriptionhistory lsh,
lib_journal_details ljd
WHERE ljd.journal_id=ljm.id
AND ljd.receipt_date = (
SELECT max(ljd.receipt_date)
from lib_journal_details ljd)
But it gives me the maximum from the entire column. My needed result will have two dates (maximum of each magazine), but this query gives me only one?

You could change the WHERE clause to look up the latest date for each journal:
AND ljd.receipt_date = (
SELECT max(subljd.receipt_date)
from lib_journal_details subljd
where subljd.journal_id = ljd.journal_id)
Make sure to give the table in the subquery a different alias from the table in the main query.
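Here is a minimal, runnable sketch of the correlated-subquery approach, assuming SQLite through Python's sqlite3 module. The schema is a guess reduced to the columns the question actually joins on (id, journelname, journal_id, receipt_date), the dates are the ones from the question's sample output, and the subquery correlates on journal_id because that is the column the question's own join uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE lib_journals_master (id INTEGER PRIMARY KEY, journelname TEXT);
    CREATE TABLE lib_journal_details (journal_id INTEGER, receipt_date TEXT);
    INSERT INTO lib_journals_master VALUES (1, 'Business World'), (2, 'Linux 4 U');
    INSERT INTO lib_journal_details VALUES
        (1, '2009-10-28'), (1, '2009-09-23'), (1, '2009-08-18'),
        (2, '2009-12-24'), (2, '2009-11-11'), (2, '2009-10-28');
""")

latest = conn.execute("""
    SELECT ljm.journelname, ljd.receipt_date
    FROM lib_journals_master ljm
    JOIN lib_journal_details ljd ON ljd.journal_id = ljm.id
    WHERE ljd.receipt_date = (
        -- Re-evaluated per outer row: the max date for that row's journal only.
        SELECT MAX(subljd.receipt_date)
        FROM lib_journal_details subljd
        WHERE subljd.journal_id = ljd.journal_id
    )
    ORDER BY ljm.journelname
""").fetchall()

for row in latest:
    print(row)
```

The distinct alias (subljd) is what lets the inner query refer to the outer row's journal_id; without the correlation, MAX runs over the whole column, which is the behaviour the question describes.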

You should use GROUP BY if you need the MAX of the date.
It should look something like this:
SELECT
ljm.journelname
, ljm.subscription_id
, ljm.frequency
, ljm.publisher
, ljm.price
, MAX(ljd.receipt_date)
FROM
lib_journals_master ljm
, lib_subscriptionhistory lsh
, lib_journal_details ljd
WHERE
ljd.journal_id=ljm.id
GROUP BY
ljm.journelname
, ljm.subscription_id
, ljm.frequency
, ljm.publisher
, ljm.price

Something like this should work for you.
SELECT ljm.journelname
, ljm.subscription_id
, ljm.frequency
, ljm.publisher
, ljm.price
,md.max_receipt_date
FROM lib_journals_master ljm
, ( SELECT journal_id
, max(receipt_date) as max_receipt_date
FROM lib_journal_details
GROUP BY journal_id) md
WHERE ljm.id = md.journal_id
/
Note that I have removed the tables from the FROM clause which don't contribute anything to the query. You may need to put them back if you simplified your scenario for our benefit.
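The derived-table version can be checked the same way. This is a sketch assuming SQLite through Python's sqlite3 module, with a hypothetical two-column schema and the dates from the question's sample output; an explicit JOIN replaces the comma join, which should not change the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE lib_journals_master (id INTEGER PRIMARY KEY, journelname TEXT);
    CREATE TABLE lib_journal_details (journal_id INTEGER, receipt_date TEXT);
    INSERT INTO lib_journals_master VALUES (1, 'Business World'), (2, 'Linux 4 U');
    INSERT INTO lib_journal_details VALUES
        (1, '2009-10-28'), (1, '2009-09-23'), (1, '2009-08-18'),
        (2, '2009-12-24'), (2, '2009-11-11'), (2, '2009-10-28');
""")

per_journal = conn.execute("""
    SELECT ljm.journelname, md.max_receipt_date
    FROM lib_journals_master ljm
    JOIN (
        -- One row per journal: its id and its latest receipt date.
        SELECT journal_id, MAX(receipt_date) AS max_receipt_date
        FROM lib_journal_details
        GROUP BY journal_id
    ) md ON ljm.id = md.journal_id
    ORDER BY ljm.journelname
""").fetchall()

for row in per_journal:
    print(row)
```

Here the aggregation happens once, inside the derived table, and the outer query only attaches the descriptive columns.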

Separate this into two queries. The first gets each journal name and its latest date:
create table #table (journalName varchar(255), saleDate datetime)
insert into #table
select journalName, max(saleDate) from JournalTable group by journalName
Then select all the fields you need from your table and join #table to them on journalName.

Sounds like top of group. You can use a CTE in SQL Server:
;WITH journeldata AS
(
SELECT
ljm.journelname
,ljm.subscription_id
,ljm.frequency
,ljm.publisher
,ljm.price
,ljd.receipt_date
,ROW_NUMBER() OVER (PARTITION BY ljm.journelname ORDER BY ljd.receipt_date DESC) AS RowNumber
FROM
lib_journals_master ljm
,lib_subscriptionhistory lsh
,lib_journal_details ljd
WHERE
ljd.journal_id=ljm.id
AND ljm.subscription_id = ljm.subscription_id
)
SELECT
journelname
,subscription_id
,frequency
,publisher
,price
,receipt_date
FROM journeldata
WHERE RowNumber = 1
