Problem selecting the latest record in JOIN - sql

These are my 2 tables:
CREATE TABLE `documents` (
`Document_ID` int(10) NOT NULL auto_increment,
`Document_FolderID` int(10) NOT NULL,
`Document_Name` varchar(150) NOT NULL,
PRIMARY KEY (`Document_ID`),
KEY `Document_FolderID` (`Document_FolderID`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=331 ;
CREATE TABLE `files` (
`File_ID` int(10) NOT NULL auto_increment,
`File_DocumentID` int(10) NOT NULL,
`File_Name` varchar(255) NOT NULL,
PRIMARY KEY (`File_ID`),
KEY `File_DocumentID` (`File_DocumentID`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=333 ;
A document can have multiple files. I am trying to SELECT all of the documents with a JOIN on the files table, but I want only one file record per document: the latest one.
Here is the query I have come up with, which doesn't quite work. Can anyone suggest the right way?
SELECT `documents`.*
FROM `documents`
INNER JOIN (
SELECT MAX(`File_ID`), *
FROM `files`
WHERE `File_DocumentID` = `documents`.`Document_ID`
GROUP BY `File_ID` ) AS `file1`
ON `documents`.`Document_ID` = `file1`.`File_DocumentID`
WHERE `documents`.`Document_FolderID` = 94
ORDER BY `documents`.`Document_Name`
*edit: the error is Unknown column 'documents.Document_ID' in 'where clause'

Use:
SELECT d.*, f.*
FROM documents d
JOIN files f ON f.File_DocumentID = d.Document_ID
JOIN (SELECT t.File_DocumentID,
MAX(t.File_ID) AS max_file_id
FROM files t
GROUP BY t.File_DocumentID) x ON x.File_DocumentID = f.File_DocumentID
AND x.max_file_id = f.File_ID
The derived table (inline view) aliased "x" joins the files table back to itself; all it does is restrict the rows coming from files to the one with the highest File_ID per File_DocumentID.
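To see the join-on-MAX pattern in action, here is a minimal sketch that runs it against an in-memory SQLite database from Python. The sample rows are made up for illustration, and SQLite stands in for MySQL here, so treat it as a demonstration of the technique rather than a test of the exact MySQL behaviour.

```python
import sqlite3

# In-memory database with a cut-down version of the two tables (sample data is invented).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE documents (
    Document_ID INTEGER PRIMARY KEY,
    Document_FolderID INTEGER NOT NULL,
    Document_Name TEXT NOT NULL
);
CREATE TABLE files (
    File_ID INTEGER PRIMARY KEY,
    File_DocumentID INTEGER NOT NULL,
    File_Name TEXT NOT NULL
);
INSERT INTO documents VALUES (1, 94, 'Beta'), (2, 94, 'Alpha'), (3, 7, 'Other');
INSERT INTO files VALUES
    (10, 1, 'beta_v1.pdf'),
    (11, 1, 'beta_v2.pdf'),
    (12, 2, 'alpha_v1.pdf'),
    (13, 3, 'other.pdf');
""")

# The derived table x holds the highest File_ID per document;
# joining files to it keeps only that one file row per document.
latest_files = conn.execute("""
SELECT d.Document_Name, f.File_ID, f.File_Name
FROM documents d
JOIN files f ON f.File_DocumentID = d.Document_ID
JOIN (SELECT File_DocumentID, MAX(File_ID) AS max_file_id
      FROM files
      GROUP BY File_DocumentID) x
  ON x.File_DocumentID = f.File_DocumentID
 AND x.max_file_id = f.File_ID
WHERE d.Document_FolderID = 94
ORDER BY d.Document_Name
""").fetchall()

for row in latest_files:
    print(row)
```

Each document in folder 94 comes back once, paired with its highest File_ID. Note this treats "latest" as "highest auto-increment id", which is what the question implies.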

Don't group by File_ID; group by File_DocumentID instead.

I think I see what's wrong... You have GROUP BY File_ID, but I guess you really want GROUP BY File_DocumentID instead.

Related

SQLite: Get Output From Two Tables Using Common Reference ID

I am new to SQLite and I have been working on an issue for quite a long time.
Let's say we have two database tables, tbl_expense and tbl_category. The table structures are given below.
tbl_category
CREATE TABLE IF NOT EXISTS tbl_category(
category_id INTEGER PRIMARY KEY AUTOINCREMENT,
category_name VARCHAR(20) DEFAULT NULL,
category_desc VARCHAR(500) DEFAULT NULL,
category_icon VARCHAR(100) DEFAULT NULL,
category_created timestamp default CURRENT_TIMESTAMP
)
tbl_expense
CREATE TABLE IF NOT EXISTS tbl_expense(
expense_id INTEGER PRIMARY KEY AUTOINCREMENT,
expense_name VARCHAR(20) DEFAULT NULL,
expense_desc VARCHAR(500) DEFAULT NULL,
expense_type VARCHAR(20) DEFAULT NULL,
expense_amt DECIMAL(6,3) DEFAULT NULL,
expense_date TIMESTAMP DEFAULT NULL,
expense_category INTEGER DEFAULT NULL,
expense_created_date timestamp DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY (expense_category) REFERENCES tbl_category(category_id)
ON DELETE SET NULL
)
Assume we have data in the tables like this below.
Expected Output:
Assume category_id and expense_category are the common fields. How can I write an SQL query that lists all categories along with the sum of their expense amounts?
Please help me with this issue.
You need an INNER join of the tables and aggregation:
SELECT c.category_name Category,
SUM(e.expense_amt) Amount
FROM tbl_category c INNER JOIN tbl_expense e
ON e.expense_category = c.category_id
GROUP BY c.category_id;
If you want all categories from the table tbl_category, even those that are not present in tbl_expense, use a LEFT join and TOTAL() aggregate function:
SELECT c.category_name Category,
TOTAL(e.expense_amt) Amount
FROM tbl_category c LEFT JOIN tbl_expense e
ON e.expense_category = c.category_id
GROUP BY c.category_id;
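The difference between the two queries is easiest to see side by side. Below is a small sketch using Python's sqlite3 module with invented sample rows (three categories, one of which has no expenses). It also shows why TOTAL() is suggested over SUM() for the LEFT JOIN: in SQLite, SUM() over no non-NULL rows yields NULL, while TOTAL() yields 0.0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl_category (
    category_id INTEGER PRIMARY KEY AUTOINCREMENT,
    category_name VARCHAR(20)
);
CREATE TABLE tbl_expense (
    expense_id INTEGER PRIMARY KEY AUTOINCREMENT,
    expense_amt DECIMAL(6,3),
    expense_category INTEGER REFERENCES tbl_category(category_id)
);
-- Sample data (invented): 'Rent' has no expenses at all.
INSERT INTO tbl_category (category_name) VALUES ('Food'), ('Travel'), ('Rent');
INSERT INTO tbl_expense (expense_amt, expense_category)
VALUES (10.5, 1), (4.5, 1), (20, 2);
""")

# INNER JOIN: categories without expenses are dropped from the result.
inner_rows = conn.execute("""
SELECT c.category_name, SUM(e.expense_amt)
FROM tbl_category c INNER JOIN tbl_expense e
  ON e.expense_category = c.category_id
GROUP BY c.category_id
ORDER BY c.category_id
""").fetchall()

# LEFT JOIN: every category is kept; compare SUM() and TOTAL() for the empty one.
left_rows = conn.execute("""
SELECT c.category_name, SUM(e.expense_amt), TOTAL(e.expense_amt)
FROM tbl_category c LEFT JOIN tbl_expense e
  ON e.expense_category = c.category_id
GROUP BY c.category_id
ORDER BY c.category_id
""").fetchall()

print(inner_rows)
print(left_rows)
```

With this data the INNER JOIN returns only Food and Travel, while the LEFT JOIN also returns Rent, with NULL from SUM() and 0.0 from TOTAL().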

Get rows that no foreign keys point to

I have two tables
CREATE TABLE public.city_url
(
id bigint NOT NULL DEFAULT nextval('city_url_id_seq'::regclass),
url text,
city text,
state text,
country text,
common_name text,
CONSTRAINT city_url_pkey PRIMARY KEY (id)
)
and
CREATE TABLE public.email_account
(
id bigint NOT NULL DEFAULT nextval('email_accounts_id_seq'::regclass),
email text,
password text,
total_replied integer DEFAULT 0,
last_accessed timestamp with time zone,
enabled boolean NOT NULL DEFAULT true,
deleted boolean NOT NULL DEFAULT false,
city_url_id bigint,
CONSTRAINT email_accounts_pkey PRIMARY KEY (id),
CONSTRAINT email_account_city_url_id_fkey FOREIGN KEY (city_url_id)
REFERENCES public.city_url (id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION
)
I want to come up with a query that fetches rows in the city_url only if there is no row in the email_account pointing to it with the city_url_id column.
NOT EXISTS comes to mind:
select c.*
from city_url c
where not exists (select 1
from email_account ea
where ea.city_url_id = c.id
);
There's also this option:
SELECT city_url.*
FROM city_url
LEFT JOIN email_account ON email_account.city_url_id = city_url.id
WHERE email_account.id IS NULL
NOT EXISTS is the direct answer to "... only if there is no row ...".
Nonetheless, it may be preferable to accomplish this by selecting the set difference with a LEFT JOIN.
In principle, that looks like this:
SELECT a.*
FROM table1 a
LEFT JOIN table2 b
ON a.[columnX] = b.[columnY]
WHERE b.[columnY] IS NULL
Using the tablenames here, this would be:
SELECT c.*
FROM city_url c
LEFT JOIN email_account e
ON c.id = e.city_url_id
WHERE e.city_url_id IS NULL
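I believe NOT IN could be used here as well, although this might be less performant on large datasets. Because city_url_id is nullable, NULLs have to be filtered out of the subquery; otherwise NOT IN returns no rows at all:
SELECT *
FROM city_url
WHERE id NOT IN (
SELECT city_url_id FROM email_account
WHERE city_url_id IS NOT NULL
)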
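The three approaches above can be compared on a small data set. The sketch below uses Python's sqlite3 module with cut-down tables and invented rows (SQLite standing in for PostgreSQL; the NULL semantics shown are standard SQL). One email_account row deliberately has a NULL city_url_id, which is what trips up a plain NOT IN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE city_url (id INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE email_account (
    id INTEGER PRIMARY KEY,
    email TEXT,
    city_url_id INTEGER REFERENCES city_url(id)
);
-- Sample data (invented): only city 1 is referenced; one account has no city.
INSERT INTO city_url VALUES (1, 'Austin'), (2, 'Boston'), (3, 'Chicago');
INSERT INTO email_account VALUES (1, 'a@example.com', 1), (2, 'b@example.com', NULL);
""")

def ids(sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]

not_exists = ids("""
SELECT c.id FROM city_url c
WHERE NOT EXISTS (SELECT 1 FROM email_account ea WHERE ea.city_url_id = c.id)
ORDER BY c.id
""")

left_join = ids("""
SELECT c.id FROM city_url c
LEFT JOIN email_account e ON e.city_url_id = c.id
WHERE e.id IS NULL
ORDER BY c.id
""")

# Plain NOT IN: the NULL in the subquery makes the predicate unknown for every
# unreferenced city, so nothing is returned.
not_in_naive = ids("""
SELECT id FROM city_url
WHERE id NOT IN (SELECT city_url_id FROM email_account)
ORDER BY id
""")

# NOT IN with NULLs filtered out behaves like the other two.
not_in_filtered = ids("""
SELECT id FROM city_url
WHERE id NOT IN (SELECT city_url_id FROM email_account
                 WHERE city_url_id IS NOT NULL)
ORDER BY id
""")

print(not_exists, left_join, not_in_naive, not_in_filtered)
```

NOT EXISTS and the LEFT JOIN both return cities 2 and 3. The unfiltered NOT IN returns nothing, which is a good reason to prefer NOT EXISTS whenever the referencing column can be NULL.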

Two problems with my query: Show null values and order by before group by

I'm having major problems with my query. I want to show all results in the source table even if there is no pricing entry in the right table.
My order by is also not working. I want to order by product_pricing.PP_CashPrice prior to grouping by.
Here is my SQL code:
SELECT * FROM source
LEFT JOIN product_pricing ON source.Source_ID = product_pricing.Source_ID
WHERE (product_pricing.Product_ID = '234'
OR product_pricing.PP_ID = NULL)
AND source.Source_Active = 'Yes'
GROUP by source.Source_ID
ORDER by PP_CashPrice desc
I basically need it to show all sources. The right table will have duplicates, but I only need to show the highest price.
My right table is as follows:
CREATE TABLE product_pricing (
PP_ID int(10) NOT NULL AUTO_INCREMENT,
PP_Type varchar(150) NOT NULL,
PP_CashPrice decimal(10,2) NOT NULL,
PP_DateObtained date NOT NULL,
PP_TimeObtained time NOT NULL,
PP_Active varchar(3) NOT NULL,
PP_Postcode varchar(150) NOT NULL,
Source_ID int(10) NOT NULL,
SC_ID int(10) NOT NULL,
Product_ID int(10) NOT NULL,
PRIMARY KEY (PP_ID)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
You should not filter a LEFT JOINed table in the WHERE clause, because that discards the rows with no match. Put the condition in the ON clause instead. The filter on source itself stays in the WHERE clause.
I would also use COALESCE in the ORDER BY clause, and probably add an ordering on s.Source_ID if you want the rows ordered by source first and by price within each source.
SELECT * FROM source s
LEFT JOIN product_pricing pp ON s.Source_ID = pp.Source_ID AND pp.Product_ID = 234
WHERE s.Source_Active = 'Yes'
GROUP by s.Source_ID
ORDER by s.Source_ID, COALESCE(pp.PP_CashPrice, 0) desc
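Here is a minimal sketch of the ON-versus-WHERE difference, run on SQLite from Python with invented rows. It is not the exact query from the answer: to get "the highest one" deterministically it selects MAX(pp.PP_CashPrice) instead of relying on SELECT * with GROUP BY, and Source_Name is a hypothetical column added only to make the output readable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE source (
    Source_ID INTEGER PRIMARY KEY,
    Source_Name TEXT,          -- hypothetical column, for readable output
    Source_Active TEXT
);
CREATE TABLE product_pricing (
    PP_ID INTEGER PRIMARY KEY,
    PP_CashPrice REAL NOT NULL,
    Source_ID INTEGER NOT NULL,
    Product_ID INTEGER NOT NULL
);
-- Sample data (invented): South has no price for product 234, Closed is inactive.
INSERT INTO source VALUES (1, 'North', 'Yes'), (2, 'South', 'Yes'), (3, 'Closed', 'No');
INSERT INTO product_pricing VALUES
    (1, 9.5, 1, 234), (2, 12.0, 1, 234), (3, 30.0, 2, 999), (4, 5.0, 3, 234);
""")

# Product filter in ON: sources without a matching price survive with NULL.
on_rows = conn.execute("""
SELECT s.Source_ID, s.Source_Name, MAX(pp.PP_CashPrice) AS best_price
FROM source s
LEFT JOIN product_pricing pp
  ON pp.Source_ID = s.Source_ID AND pp.Product_ID = 234
WHERE s.Source_Active = 'Yes'
GROUP BY s.Source_ID, s.Source_Name
ORDER BY COALESCE(MAX(pp.PP_CashPrice), 0) DESC, s.Source_ID
""").fetchall()

# Product filter in WHERE: the NULL-extended rows are discarded,
# so the LEFT JOIN silently behaves like an INNER JOIN.
where_rows = conn.execute("""
SELECT s.Source_ID, s.Source_Name, MAX(pp.PP_CashPrice) AS best_price
FROM source s
LEFT JOIN product_pricing pp ON pp.Source_ID = s.Source_ID
WHERE pp.Product_ID = 234 AND s.Source_Active = 'Yes'
GROUP BY s.Source_ID, s.Source_Name
ORDER BY COALESCE(MAX(pp.PP_CashPrice), 0) DESC, s.Source_ID
""").fetchall()

print(on_rows)
print(where_rows)
```

With the filter in ON, both active sources are listed and South simply has no price. With the filter in WHERE, South disappears.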

Help with complex join conditions

I have the following mysql table schema:
SET SQL_MODE="NO_AUTO_VALUE_ON_ZERO";
--
-- Database: `network`
--
-- --------------------------------------------------------
--
-- Table structure for table `contexts`
--
CREATE TABLE IF NOT EXISTS `contexts` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`keyword` varchar(255) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=4 ;
-- --------------------------------------------------------
--
-- Table structure for table `neurons`
--
CREATE TABLE IF NOT EXISTS `neurons` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(255) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=5 ;
-- --------------------------------------------------------
--
-- Table structure for table `synapses`
--
CREATE TABLE IF NOT EXISTS `synapses` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`n1_id` int(11) NOT NULL,
`n2_id` int(11) NOT NULL,
`context_id` int(11) NOT NULL,
`strength` double NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=6 ;
What SQL can I write to get all Neurons associated with a specified context along with the sum of the strength column for the synapses associated with each neuron?
I'm using the following query, which returns the sum of the strength of the synapses associated with one neuron. I need to get information for all of the neurons:
/* This query finds how strongly the neuron with id 1 is connected to the context with keyword ice cream*/
SELECT SUM(strength) AS Strength FROM
synapses
JOIN contexts AS Context ON synapses.context_id = Context.id
JOIN neurons AS Neuron ON Neuron.id = synapses.n1_id OR Neuron.id = synapses.n2_id
WHERE Neuron.id = 1 AND Context.keyword = 'ice cream'
For example, that query returns one row, where Strength is 2. Ideally, I could have one column for the neurons.id, one for neurons.name, and one for SUM(synapses.strength) with one record for each distinct neuron.
Use:
SELECT DISTINCT
n.*,
COALESCE(x.strength, 0) AS strength
FROM NEURONS n
JOIN SYNAPSES s ON n.id IN (s.n1_id, s.n2_id)
JOIN CONTEXTS c ON c.id = s.context_id
LEFT JOIN (SELECT c.id AS c_id,
n.id AS n_id,
SUM(strength) AS Strength
FROM SYNAPSES s
JOIN CONTEXTS c ON c.id = s.context_id
JOIN NEURONS n ON n.id IN (s.n1_id, s.n2_id)
GROUP BY c.id, n.id) x ON x.c_id = c.id
AND x.n_id = n.id
Does this do what you want?
SELECT contexts.keyword, neurons.id, neurons.name, SUM(synapses.strength)
FROM neurons
INNER JOIN synapses ON neurons.id = synapses.n1_id OR neurons.id = synapses.n2_id
INNER JOIN contexts ON synapses.context_id = contexts.id
GROUP BY contexts.keyword, neurons.id, neurons.name
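For a single context, the grouped query can be tried out as follows. This is a sketch on SQLite from Python with invented rows; the WHERE clause on the keyword is added here to match the "specified context" part of the question, and the sample data is chosen so that neuron 1 gets a strength of 2 for 'ice cream', as in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contexts (id INTEGER PRIMARY KEY, keyword TEXT NOT NULL);
CREATE TABLE neurons (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE synapses (
    id INTEGER PRIMARY KEY,
    n1_id INTEGER NOT NULL,
    n2_id INTEGER NOT NULL,
    context_id INTEGER NOT NULL,
    strength REAL NOT NULL
);
-- Sample data (invented).
INSERT INTO contexts VALUES (1, 'ice cream'), (2, 'pizza');
INSERT INTO neurons VALUES (1, 'cold'), (2, 'sweet'), (3, 'cheese');
INSERT INTO synapses VALUES
    (1, 1, 2, 1, 1.5),
    (2, 1, 2, 1, 0.5),
    (3, 2, 3, 2, 4.0);
""")

def neuron_strengths(keyword):
    """Return (neuron id, name, summed strength) for every neuron that has
    at least one synapse in the context identified by `keyword`."""
    return conn.execute("""
        SELECT n.id, n.name, SUM(s.strength) AS strength
        FROM neurons n
        JOIN synapses s ON n.id IN (s.n1_id, s.n2_id)
        JOIN contexts c ON c.id = s.context_id
        WHERE c.keyword = ?
        GROUP BY n.id, n.name
        ORDER BY n.id
    """, (keyword,)).fetchall()

ice_cream = neuron_strengths("ice cream")
pizza = neuron_strengths("pizza")
print(ice_cream)
print(pizza)
```

Neurons with no synapse in the chosen context are not listed by this version. If they should appear with a strength of 0, start from neurons and LEFT JOIN the rest, as the first answer does.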

Getting sum() on a different distinct row MySQL

I looked through different questions on this issue, but couldn't find an answer to my problem.
This is my query:
SELECT SUM( lead_value ) AS lead_value_sum, count( DISTINCT phone ) AS SUM, referer
FROM leads t1
INNER JOIN leads_people_details t2 ON t1.lead_id = t2.lead_id
INNER JOIN user_to_leads t3 ON t1.lead_id = t3.lead_id
WHERE lead_date
BETWEEN 20100716000000
AND 20100716235959
AND t1.site_id =8
GROUP BY t1.referer
I am trying to sum up the lead_value only for unique phone numbers. The COUNT(DISTINCT phone) actually works and gives me the number of unique phones for each referer, but I can't work out how I should SUM the lead_value for unique phone numbers per referer.
Would appreciate any help you can give me,
Eden
Edit: Table Structures
CREATE TABLE user_to_leads
(
user_id INT(10) NOT NULL,
lead_id INT(10) NOT NULL,
site_id INT(10) NOT NULL,
lead_value INT(10) NOT NULL
)
CREATE TABLE leads
(
lead_id INT(100) NOT NULL auto_increment ,
site_id INT(10) NOT NULL ,
lead_date TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ,
vaild_date TIMESTAMP NOT NULL DEFAULT '0000-00-00 00:00:00',
referer VARCHAR(255) NOT NULL,
KEYWORD VARCHAR(255) NOT NULL,
upsale INT(11) NOT NULL DEFAULT '0' ,
vaild INT(2) NOT NULL,
PRIMARY KEY (lead_id),
KEY lead_date (lead_date)
)
CREATE TABLE leads_people_details
(
lead_id INT(100) NOT NULL auto_increment ,
fullname VARCHAR(255) NOT NULL,
phone VARCHAR(12) NOT NULL ,
email VARCHAR(255) NOT NULL,
home VARCHAR(255) NOT NULL,
browser VARCHAR(255) NOT NULL,
browser_version VARCHAR(100) NOT NULL,
resolution VARCHAR(255) NOT NULL,
IP VARCHAR(255) NOT NULL,
status VARCHAR(255) NOT NULL DEFAULT '0',
COMMENT text NOT NULL,
PRIMARY KEY (lead_id)
)
You say:
"For a particular referer, phone, the lead_value will always be the same"
Based on the limited information you have given I think this should return the right answer. If you update your question with the requested information it will probably be possible to improve upon it though.
SELECT SUM(lead_value ) AS lead_value_sum, count(phone ) AS phone_count, referer
FROM
(
SELECT DISTINCT lead_value, phone, referer
FROM leads t1
INNER JOIN leads_people_details t2 ON t1.lead_id = t2.lead_id
INNER JOIN user_to_leads t3 ON t1.lead_id = t3.lead_id
WHERE lead_date
BETWEEN 20100716000000
AND 20100716235959
AND t1.site_id =8
) derived
GROUP BY referer
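The effect of the DISTINCT derived table can be checked on a tiny data set. The sketch below uses SQLite from Python with cut-down tables and invented rows; the date filter is left out for brevity. It relies on the same assumption as the answer, namely that lead_value is identical for a given referer and phone. If it were not, DISTINCT would keep both rows and the phone would be counted twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE leads (lead_id INTEGER PRIMARY KEY, site_id INTEGER, referer TEXT);
CREATE TABLE leads_people_details (lead_id INTEGER PRIMARY KEY, phone TEXT);
CREATE TABLE user_to_leads (
    user_id INTEGER, lead_id INTEGER, site_id INTEGER, lead_value INTEGER
);
-- Sample data (invented): leads 1 and 2 share the same phone and referer.
INSERT INTO leads VALUES (1, 8, 'google'), (2, 8, 'google'), (3, 8, 'bing');
INSERT INTO leads_people_details VALUES (1, '555-0001'), (2, '555-0001'), (3, '555-0002');
INSERT INTO user_to_leads VALUES (100, 1, 8, 50), (100, 2, 8, 50), (100, 3, 8, 30);
""")

# Original shape: the repeated phone is counted once but summed twice.
naive = conn.execute("""
SELECT t1.referer, SUM(t3.lead_value), COUNT(DISTINCT t2.phone)
FROM leads t1
INNER JOIN leads_people_details t2 ON t1.lead_id = t2.lead_id
INNER JOIN user_to_leads t3 ON t1.lead_id = t3.lead_id
WHERE t1.site_id = 8
GROUP BY t1.referer
ORDER BY t1.referer
""").fetchall()

# Deduplicate (lead_value, phone, referer) first, then aggregate.
deduped = conn.execute("""
SELECT referer, SUM(lead_value), COUNT(phone)
FROM (SELECT DISTINCT t3.lead_value, t2.phone, t1.referer
      FROM leads t1
      INNER JOIN leads_people_details t2 ON t1.lead_id = t2.lead_id
      INNER JOIN user_to_leads t3 ON t1.lead_id = t3.lead_id
      WHERE t1.site_id = 8) derived
GROUP BY referer
ORDER BY referer
""").fetchall()

print(naive)
print(deduped)
```

For 'google' the naive query sums the value of the shared phone twice, while the deduplicated version sums it once.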
Updated after table structure posted
I don't really understand why both leads_people_details and leads have a primary key and auto_increment column of lead_id that you are joining on. That would imply a 1-1 relationship between leads and leads_people_details. If so, one of them probably shouldn't be an auto_increment, to avoid the possibility of the ids getting out of sync without you realising.
Also, there is no primary key on the user_to_leads table. Should there be one on user_id, lead_id, site_id? Additionally, you are not currently filtering by site_id on that table. Is that intentional? If not, does adding that filter stop the duplicate records from coming back? If it doesn't, can you describe the significance of user_id in that table? You said earlier that for a particular referer, phone, the lead_value will always be the same. Can it differ by user_id? If so, which value should be used? If not, why is user_id in that table?
A provisional query that might be closer is below, but the questions above are still unresolved.
SELECT SUM(lead_value ) AS lead_value_sum, count(phone ) AS phone_count, referer
FROM leads t1
INNER JOIN leads_people_details t2 ON t1.lead_id = t2.lead_id
INNER JOIN user_to_leads t3 ON t1.lead_id = t3.lead_id
and t1.site_id = t3.site_id
WHERE lead_date
BETWEEN 20100716000000
AND 20100716235959
AND t1.site_id =8