SQLite: calculating the difference between dates gives an incorrect result

I am having trouble calculating a date difference in SQLite.
I set the column type to TIMESTAMP when creating the tables, but the date calculation seems to apply only to the first number of each date entry.
I tried to use to_date('01/01/2020', 'mm/dd/yyyy'), but it returns an error saying that to_date is not supported. My code is below; any suggestions would be much appreciated.
CREATE TABLE customer_join
(
id INT,
country_code VARCHAR(10),
country_descrip VARCHAR(255),
register_date TIMESTAMP,
customer_id INT,
PRIMARY KEY (id),
FOREIGN KEY (customer_id) REFERENCES customer(id)
);
CREATE TABLE customer_order
(
id INT,
item_name VARCHAR(25),
item_description VARCHAR(255),
number FLOAT(24),
order_date TIMESTAMP,
customer_id INT,
PRIMARY KEY (id),
FOREIGN KEY (customer_id) REFERENCES customer(id)
);
INSERT INTO customer_join
Values (1, 1, 'none', '1/22/2017', 100),
(2, 1, 'none', '1/23/2017', 101),
(3, 1, 'none', '1/24/2017', 102),
(4, 1, 'none', '1/25/2017', 103),
(5, 1, 'none', '1/26/2017', 104),
(6, 2, 'none', '1/27/2017', 101),
(7, 2, 'none', '1/28/2017', 106),
(8, 1, 'none', '1/29/2017', 107);
INSERT INTO customer_order
Values (1, 'A', 'none', 1, '2/23/2020', 101),
(2, 'B', 'none', 1, '3/11/2027', 100),
(3, 'B, C, D', 'none', 1, '4/10/2023', 100),
(4, 'B, C, E', 'none', 1, '4/11/2020', 100),
(5, 'R', 'none',1, '4/12/2099', 102);
SELECT (order_date - register_date) TIME_TO_ORDER
FROM customer_join cj
INNER JOIN
(SELECT customer_id , MIN(order_date) order_date
FROM customer_order
GROUP BY customer_id) co
ON cj.customer_id = co.customer_id;
The code gives me the result:
TIME_TO_ORDER
2
1
3
1
This is not what I wanted. I am trying to figure out how long it takes customers to place their first order. Any suggestions?

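First, you must change the format of the dates in both tables to YYYY-MM-DD, the ISO-8601 text format that SQLite's date functions understand. SQLite has no dedicated date type, so your values are stored as plain text, and subtracting them converts each string to a number using only its leading digits; that is why the calculation seemed to use just the first number of each date.
Then use the function julianday() to get the difference in days between the dates: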
SELECT cj.customer_id,
julianday(co.order_date) - julianday(cj.register_date) TIME_TO_ORDER
FROM customer_join cj
INNER JOIN (
SELECT customer_id , MIN(order_date) order_date
FROM customer_order
GROUP BY customer_id
) co ON cj.customer_id = co.customer_id;
Results:
customer_id | TIME_TO_ORDER
----------: | ------------:
100 | 1175
101 | 1126
102 | 30028
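The two steps above (rewrite the dates, then subtract with julianday()) can be sketched with Python's built-in sqlite3 module. This is only an illustration under some assumptions: the tables are trimmed to the columns involved, only the first three customer_join rows are loaded, and to_iso is a hypothetical helper that is not part of the original answer.

```python
import sqlite3
from datetime import datetime


def to_iso(mdy: str) -> str:
    """Hypothetical helper: convert 'M/D/YYYY' text to 'YYYY-MM-DD'."""
    return datetime.strptime(mdy, "%m/%d/%Y").strftime("%Y-%m-%d")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer_join (id INT PRIMARY KEY, register_date TIMESTAMP, customer_id INT)")
conn.execute("CREATE TABLE customer_order (id INT PRIMARY KEY, order_date TIMESTAMP, customer_id INT)")

# A trimmed copy of the question's data, still in the original M/D/YYYY form.
joins = [(1, "1/22/2017", 100), (2, "1/23/2017", 101), (3, "1/24/2017", 102)]
orders = [
    (1, "2/23/2020", 101),
    (2, "3/11/2027", 100),
    (3, "4/10/2023", 100),
    (4, "4/11/2020", 100),
    (5, "4/12/2099", 102),
]

# Step 1: store the dates as ISO-8601 text.
conn.executemany("INSERT INTO customer_join VALUES (?, ?, ?)",
                 [(i, to_iso(d), c) for i, d, c in joins])
conn.executemany("INSERT INTO customer_order VALUES (?, ?, ?)",
                 [(i, to_iso(d), c) for i, d, c in orders])

# Step 2: subtract Julian day numbers to get the difference in days.
time_to_order = dict(conn.execute("""
    SELECT cj.customer_id,
           julianday(co.order_date) - julianday(cj.register_date)
    FROM customer_join cj
    INNER JOIN (
        SELECT customer_id, MIN(order_date) AS order_date
        FROM customer_order
        GROUP BY customer_id
    ) co ON cj.customer_id = co.customer_id
""").fetchall())
print(time_to_order)

# For comparison: subtracting the raw strings only uses their leading digits.
leading_digit_diff = conn.execute("SELECT '3/11/2027' - '1/22/2017'").fetchone()[0]
print(leading_digit_diff)  # 3 - 1
```

Storing ISO text also makes MIN(order_date) pick the chronologically earliest order, because YYYY-MM-DD strings sort in date order while M/D/YYYY strings do not.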

Related

Finding top 10 products sold in a year

I have the tables below, along with their definitions. I want to find the top 10 products sold in a year, after computing the counts, in an optimized way and without using aggregation. I want to know whether aggregation is still needed or whether I can accomplish this without it. My query is below. Can anyone suggest a better approach?
CREATE TABLE Customer (
id int not null,
first_name VARCHAR(30),
last_name VARCHAR(30),
Address VARCHAR(60),
State VARCHAR(30),
Phone text,
PRIMARY KEY(id)
);
CREATE TABLE Product (
ProductId int not null,
name VARCHAR(30),
unitprice int,
BrandID int,
Brandname varchar(30),
color VARCHAR(30),
PRIMARY KEY(ProductId)
);
Create Table Sales (
SalesId int not null,
Date date,
Customerid int,
Productid int,
Purchaseamount int,
PRIMARY KEY(SalesId),
FOREIGN KEY (Productid) REFERENCES Product(ProductId),
FOREIGN KEY (Customerid) REFERENCES Customer(id)
);
Sample Data:
insert into
Customer(id, first_name, last_name, address, state, phone)
values
(1111, 'andy', 'johnson', '123 Maryland Heights', 'MO', 3211451234),
(1112, 'john', 'smith', '237 Jackson Heights', 'TX', 3671456534),
(1113, 'sandy', 'fleming', '878 Jersey Heights', 'NJ', 2121456534),
(1114, 'tony', 'anderson', '789 Harrison Heights', 'CA', 6101456534);
insert into
Product(ProductId, name, unitprice, BrandId, Brandname)
values
(1, 'watch',200, 100, 'apple'),
(2, 'ipad', 429, 100, 'apple'),
(3, 'iphone', 799, 100, 'apple'),
(4, 'gear', 300, 110, 'samsung'),
(5, 'phone',1000, 110, 'samsung'),
(6, 'tab', 250, 110, 'samsung'),
(7, 'laptop', 1300, 120, 'hp'),
(8, 'mouse', 10, 120, 'hp'),
(9, 'monitor', 400, 130, 'dell'),
(10, 'keyboard', 40, 130, 'dell'),
(11, 'dvddrive', 100, 130, 'dell'),
(12, 'dvddrive', 90, 150, 'lg');
insert into
Sales(SalesId, Date, CustomerID, ProductID, Purchaseamount)
values (30, '01-10-2019', 1111, 1, 200),
(31, '02-10-2019', 1111, 3, 799),
(32, '03-10-2019', 1111, 2, 429),
(33, '04-10-2019', 1111, 4, 300),
(34, '05-10-2019', 1111, 5, 1000),
(35, '06-10-2019', 1112, 7, 1300),
(36, '07-10-2019', 1112, 9, 400),
(37, '08-10-2019', 1113, 5, 2000),
(38, '09-10-2019', 1113, 4, 300),
(39, '10-10-2019', 1113, 3, 799),
(40, '11-10-2019', 1113, 2, 858),
(41, '01-10-2020', 1111, 1, 400),
(42, '02-10-2020', 1111, 2, 429),
(43, '03-10-2020', 1112, 7, 1300),
(44, '04-10-2020', 1113, 7, 2600),
(45, '05-10-2020', 1114, 7, 1300),
(46, '06-10-2020', 1114, 7, 1300),
(47, '07-10-2020', 1114, 9, 800);
Tried this:
SELECT PCY.Name, PCY.Year, PCY.SEQNUM
FROM (SELECT P.Name AS Name, Extract('Year' from S.Date) AS YEAR, COUNT(P.Productid) AS CNT,
RANK() OVER (PARTITION BY Extract('Year' from S.Date) ORDER BY COUNT(P.Productid) DESC) AS RANK
FROM Sales S inner JOIN
Product P
ON S.Productid = P.Productid
) PCY
WHERE PCY.RANK <= 10;
I am seeing this error:
ERROR: column "p.name" must appear in the GROUP BY clause or be used in an aggregate function
LINE 2: FROM (SELECT P.Name AS Name, Extract('Year' from S.Date) AS ...
^
SQL state: 42803
Character: 52
I don't understand why you want to avoid an aggregate function when you have to aggregate over your data. This query works fine, without any issues with the GROUP BY:
WITH stats AS (
SELECT EXTRACT(YEAR FROM date) AS y,
P.productid,
P.name,
COUNT(*) AS numbers_sold,
RANK() OVER (PARTITION BY EXTRACT(YEAR FROM date) ORDER BY COUNT(*) DESC) AS r
FROM product P
JOIN sales S ON S.productid = P.productid
GROUP BY 1, 2
)
SELECT y,
name,
numbers_sold
FROM stats
WHERE r <= 10;
This works because productid is the primary key of product, so the product name is functionally dependent on it and does not have to appear in the GROUP BY clause.
By the way, this was tested on version 12, but it should work on older and newer versions as well.
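The count-then-rank pattern can also be tried out locally with Python's sqlite3 module. The sketch below is an illustration, not the answer's exact query: it assumes an SQLite build with window-function support (3.25 or later), uses a small subset of the sample rows with the dates rewritten as ISO text (assuming the original values are month-day-year), swaps EXTRACT for strftime('%Y', ...), and lists every non-aggregated column in GROUP BY. The function name top_products is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (productid INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE sales (
    salesid   INTEGER PRIMARY KEY,
    date      TEXT,
    productid INTEGER REFERENCES product(productid)
);
INSERT INTO product VALUES (1, 'watch'), (2, 'ipad'), (7, 'laptop'), (9, 'monitor');
INSERT INTO sales VALUES
    (30, '2019-01-10', 1), (32, '2019-03-10', 2), (40, '2019-11-10', 2),
    (41, '2020-01-10', 1), (42, '2020-02-10', 2), (43, '2020-03-10', 7),
    (44, '2020-04-10', 7), (45, '2020-05-10', 7), (46, '2020-06-10', 7),
    (47, '2020-07-10', 9);
""")


def top_products(n):
    """Return (year, name, numbers_sold) for the n best-ranked products per year."""
    return conn.execute("""
        WITH counts AS (          -- step 1: aggregate sales per year and product
            SELECT strftime('%Y', s.date) AS y,
                   p.productid,
                   p.name,
                   COUNT(*) AS numbers_sold
            FROM product p
            JOIN sales s ON s.productid = p.productid
            GROUP BY strftime('%Y', s.date), p.productid, p.name
        ),
        ranked AS (               -- step 2: rank the counts within each year
            SELECT y, name, numbers_sold,
                   RANK() OVER (PARTITION BY y ORDER BY numbers_sold DESC) AS r
            FROM counts
        )
        SELECT y, name, numbers_sold
        FROM ranked
        WHERE r <= ?
        ORDER BY y, r, name
    """, (n,)).fetchall()


best_per_year = top_products(1)
print(best_per_year)
```

Because RANK() gives tied products the same rank, a filter such as r <= 10 can return more than ten rows per year when there are ties.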

Get most recent row inserted with the least specificity

I'll first explain the data model then the desired results and what I have tried.
I have vehicles and sales tables:
CREATE TABLE VEHICLE
(
ID INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
BRAND INT NOT NULL,
MODEL VARCHAR(255),
VERSION VARCHAR(255),
UNIQUE(BRAND, MODEL, VERSION),
FOREIGN KEY(BRAND) REFERENCES BRAND(ID)
)
CREATE TABLE SALES
(
ID INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
VEHICLE_ID INT NOT NULL,
DATE DATE NOT NULL,
SALE INT NOT NULL,
CREATED_DATE DATETIME NOT NULL DEFAULT GETDATE(),
FOREIGN KEY (VEHICLE_ID) REFERENCES VEHICLE(ID)
)
This way I can insert several entries for the same vehicle and the same date (when I want to update, I insert a new row):
INSERT INTO SALES (VEHICLE_ID, DATE, SALE, USER_ID)
VALUES (1, '2018-01-01', 2, 3) -- then later i update by inserting a new row
(1, '2018-01-01', 4, 3)
I want to retrieve the last sale inserted for a specific date range (using DATE), and then filter by a specific brand, model, or version.
I got it working by doing this:
SELECT
S.DATE AS date, SUM(S.SALE_PROJECTION) AS saleProjection
FROM
SALE_PROJECTION S,
(SELECT MAX(ID) AS id
FROM SALE_PROJECTION
WHERE DATE >= CAST(#dateStart AS DATE)
AND DATE <= CAST(#dateEnd AS DATE)
GROUP BY DATE, VEHICLE_ID) S_M,
VEHICLE V
WHERE
1 = 1
AND S.ID = S_M.ID
AND S.VEHICLE_ID = V.ID
AND V.BRAND = 1
AND V.MODEL = 'A6'
AND V.VERSION = '1.0'
GROUP BY S.DATE
ORDER BY DATE
The problem is that I want to get the sales for brand 1 from the vehicle with the least specificity. For example,
suppose I have these vehicles:
(1, 'A3', '1.0'),
(1, 'A3', '2.0'),
(1, 'A3', null),
(1, null, null);
If I insert a sale (1, 2018-01-01, 2, 3)
and then insert a sale (2, 2018-01-01, 3, 3), the sum for 2018-01-01 would be 5.
But if I then insert a sale (2, 2018-01-01, 3, 3), the sum for 2018-01-01 has to be 3, because it is the last one inserted with the least specificity.
The opposite must be true as well:
if I insert a sale (3, 2018-01-01, 4, 3),
then insert a sale (1, 2018-01-01, 1, 3),
then insert a sale (2, 2018-01-01, 1, 3),
the sum for 2018-01-01 has to be 2, because those were inserted last.
The most general combination of brand, model, and version has to "hide" the more specific ones.
Do I need to change my data model, or is this possible as it stands?
I can give more examples if needed.
Thanks in advance

SQL Query with double distinct dates iteration, and start date < my date < final date

I have a rental system database in which a user can rent an entire house or just one room of a house.
I have a table called offers, which has the columns id, room_id and a few more.
If room_id is NULL, the offer refers to an entire house.
I have a table called availability, which has the columns offer_id, room_id, date and status (available, unavailable).
If room_id is NULL, the row refers to the availability of an entire house.
select `offer_id` , `room_id`
from `availability`
where `date` > CAST('2016-05-17' as date)
and `date` <= CAST('2016-05-21' as date)
and `status` = 'available'
group by `offer_id`
having COUNT(DISTINCT `date`) = DATEDIFF('2016-05-21', '2016-05-17')
OK, but my problem is this: if a room is unavailable on day 20 but the house has another room available on day 20, the query returns a false match, because it does not distinguish between rooms. I need all availabilities where room_id is NULL (an entire house), and separately the results where room_id is not NULL, with the dates compared per room for each offer_id (offer_id = 1 and room_id = 1, offer_id = 1 and room_id = 2, ...).
SAMPLE DATA:
http://sqlfiddle.com/#!9/f5dfe
CREATE TABLE `availability` (
`offer_id` int(10) UNSIGNED NOT NULL,
`room_id` int(10) UNSIGNED DEFAULT NULL,
`date` date NOT NULL,
`status` enum('available','UNAVAILABLE') COLLATE utf8_unicode_ci NOT NULL DEFAULT 'available'
);
INSERT INTO `availability` (`offer_id`, `room_id`, `date`, `status`) VALUES
(1, NULL, '2016-05-18', 'UNAVAILABLE'),
(1, NULL, '2016-05-19', 'available'),
(1, NULL, '2016-05-20', 'available'),
(1, NULL, '2016-05-21', 'available'),
(1, 1, '2016-05-18', 'available'),
(1, 1, '2016-05-19', 'UNAVAILABLE'),
(1, 1, '2016-05-20', 'available'),
(1, 1, '2016-05-21', 'available'),
(1, 2, '2016-05-18', 'available'),
(1, 2, '2016-05-19', 'UNAVAILABLE'),
(1, 2, '2016-05-20', 'available'),
(1, 2, '2016-05-21', 'available'),
(1, 3, '2016-05-18', 'available'),
(1, 3, '2016-05-19', 'available'),
(1, 3, '2016-05-20', 'UNAVAILABLE'),
(1, 3, '2016-05-21', 'available');
Using the query above gives me one result (offer_id = 1), but the correct answer is no results,
because neither the entire house (room_id is NULL) nor any single room is available on every date between the start date and the final date.
Updated as initial answer was incorrect:
The following fiddle has more data: http://sqlfiddle.com/#!9/d6602/1/0
SELECT `offer_id`, `room_id`
FROM (
SELECT `offer_id`, `room_id`
FROM `availability`
WHERE `date` > CAST('2016-05-17' AS date)
AND `date` <= CAST('2016-05-21' AS date)
AND `status` = 'available'
GROUP BY `offer_id`, `room_id`
HAVING COUNT(DISTINCT `date`) = DATEDIFF('2016-05-21', '2016-05-17')
) AS HousesAndRooms
WHERE `room_id` IS NOT NULL
OR (`room_id` IS NULL AND `offer_id` NOT IN (
SELECT `offer_id`
FROM `availability`
WHERE `date` > CAST('2016-05-17' AS date)
AND `date` <= CAST('2016-05-21' AS date)
AND `status` = 'UNAVAILABLE'
AND `room_id` IS NOT NULL
GROUP BY `offer_id`, `room_id`
HAVING COUNT(DISTINCT `date`) > 0
));
The query above selects all available offers (houses and rooms) for a date range. Where room_id is NULL (i.e. a whole house), it also checks that no individual room (room_id is not NULL) is unavailable within that date range.
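To check this logic without a MySQL server, the same idea can be sketched with Python's sqlite3 module. Several things here are assumptions made for the illustration: SQLite has no DATEDIFF, so the day count is computed with julianday(); the helper add and the function find_available are hypothetical names; and offers 2 and 3 are invented rows added only to exercise the house rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE availability ("
    " offer_id INTEGER NOT NULL, room_id INTEGER,"
    " date TEXT NOT NULL, status TEXT NOT NULL)"
)


def add(offer_id, room_id, flags):
    """Insert one row per day starting 2016-05-18; 'A' = available, 'U' = unavailable."""
    for day, flag in enumerate(flags, start=18):
        status = "available" if flag == "A" else "UNAVAILABLE"
        conn.execute("INSERT INTO availability VALUES (?, ?, ?, ?)",
                     (offer_id, room_id, f"2016-05-{day}", status))


# Sample data from the question (offer 1: the house plus rooms 1 to 3).
add(1, None, "UAAA")
add(1, 1, "AUAA")
add(1, 2, "AUAA")
add(1, 3, "AAUA")

QUERY = """
SELECT offer_id, room_id
FROM (
    SELECT offer_id, room_id
    FROM availability
    WHERE date > :start AND date <= :end AND status = 'available'
    GROUP BY offer_id, room_id
    HAVING COUNT(DISTINCT date) = CAST(julianday(:end) - julianday(:start) AS INTEGER)
) AS houses_and_rooms
WHERE room_id IS NOT NULL
   OR (room_id IS NULL AND offer_id NOT IN (
        SELECT offer_id
        FROM availability
        WHERE date > :start AND date <= :end
          AND status = 'UNAVAILABLE'
          AND room_id IS NOT NULL
   ))
ORDER BY offer_id, room_id
"""


def find_available(start, end):
    """Return (offer_id, room_id) pairs available on every day in (start, end]."""
    return conn.execute(QUERY, {"start": start, "end": end}).fetchall()


question_result = find_available("2016-05-17", "2016-05-21")

# Hypothetical extra offers: offer 2 is fully available, offer 3 has a blocked room.
add(2, None, "AAAA")
add(2, 1, "AAAA")
add(3, None, "AAAA")
add(3, 1, "AUAA")
extended_result = find_available("2016-05-17", "2016-05-21")
print(question_result, extended_result)
```

With only the question's rows the result is empty, which is what the question expects. In the extended data the house of offer 3 is filtered out even though its own rows are all available, because one of its rooms is unavailable in the range.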

General database normalization

Suppose that I have a table of products that I sell to my customers.
Each record has a productID and productName.
I can sell more than 1 product to each customer, but I want to allow customers to only order certain products.
What would those tables look like?
This is what I have so far:
PRODUCTS
+------------+-------------+
| PROD_ID | PROD_NAME |
+------------+-------------+
CUSTOMER
+------------+-------------+
| CUST_ID | CUST_NAME |
+------------+-------------+
ORDERS
+------------+-------------+
| ORDER_ID | CUST_ID |
+------------+-------------+
I wrote and tested this with PostgreSQL, but the principles are the same for any SQL dbms.
Tables for products and customers are straightforward.
create table products (
prod_id integer primary key,
prod_name varchar(35) not null
);
insert into products values
(1, 'Product one'), (2, 'Product two'), (3, 'Product three');
create table customers (
cust_id integer primary key,
cust_name varchar(35) not null
);
insert into customers values
(100, 'Customer 100'), (200, 'Customer 200'), (300, 'Customer 300');
The table "permitted_products" controls which products each customer can order.
create table permitted_products (
cust_id integer not null references customers (cust_id),
prod_id integer not null references products (prod_id),
primary key (cust_id, prod_id)
);
insert into permitted_products values
-- Cust 100 permitted to buy all three products
(100, 1), (100, 2), (100, 3),
-- Cust 200 permitted to buy only product 2.
(200, 2);
Customer 300 has no permitted products.
create table orders (
ord_id integer primary key,
cust_id integer not null references customers (cust_id)
);
insert into orders values
(1, 100), (2, 200), (3, 100);
The table "order_line_items" is where the "magic" happens. The foreign key constraint on {cust_id, prod_id} prevents ordering products without permissions.
create table order_line_items (
ord_id integer not null,
line_item integer not null check (line_item > 0),
cust_id integer not null,
prod_id integer not null,
foreign key (ord_id) references orders (ord_id),
foreign key (cust_id, prod_id) references permitted_products (cust_id, prod_id),
primary key (ord_id, line_item)
);
insert into order_line_items values
(1, 1, 100, 1), (1, 2, 100, 2), (1, 3, 100, 3);
insert into order_line_items values
(2, 1, 200, 2);
insert into order_line_items values
(3, 1, 100, 3);
You can start an order for customer 300 . . .
insert into orders values (4, 300);
. . . but you can't insert any line items.
insert into order_line_items values (4, 1, 300, 1);
ERROR: insert or update on table "order_line_items" violates foreign key constraint "order_line_items_cust_id_fkey"
DETAIL: Key (cust_id, prod_id)=(300, 1) is not present in table "permitted_products".
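The same composite foreign key can be tried out with Python's sqlite3 module. This is a sketch under a few assumptions: SQLite only enforces foreign keys after PRAGMA foreign_keys = ON, the column types are simplified to INTEGER and TEXT, and try_insert is a hypothetical helper used to report whether a line item was accepted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.executescript("""
CREATE TABLE products  (prod_id INTEGER PRIMARY KEY, prod_name TEXT NOT NULL);
CREATE TABLE customers (cust_id INTEGER PRIMARY KEY, cust_name TEXT NOT NULL);
CREATE TABLE permitted_products (
    cust_id INTEGER NOT NULL REFERENCES customers (cust_id),
    prod_id INTEGER NOT NULL REFERENCES products (prod_id),
    PRIMARY KEY (cust_id, prod_id)
);
CREATE TABLE orders (
    ord_id  INTEGER PRIMARY KEY,
    cust_id INTEGER NOT NULL REFERENCES customers (cust_id)
);
CREATE TABLE order_line_items (
    ord_id    INTEGER NOT NULL,
    line_item INTEGER NOT NULL CHECK (line_item > 0),
    cust_id   INTEGER NOT NULL,
    prod_id   INTEGER NOT NULL,
    FOREIGN KEY (ord_id) REFERENCES orders (ord_id),
    FOREIGN KEY (cust_id, prod_id) REFERENCES permitted_products (cust_id, prod_id),
    PRIMARY KEY (ord_id, line_item)
);
INSERT INTO products  VALUES (1, 'Product one'), (2, 'Product two'), (3, 'Product three');
INSERT INTO customers VALUES (100, 'Customer 100'), (200, 'Customer 200'), (300, 'Customer 300');
INSERT INTO permitted_products VALUES (100, 1), (100, 2), (100, 3), (200, 2);
INSERT INTO orders VALUES (1, 100), (2, 200), (4, 300);
""")


def try_insert(row):
    """Hypothetical helper: return True if the line item is accepted, False if rejected."""
    try:
        conn.execute("INSERT INTO order_line_items VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


permitted = try_insert((2, 1, 200, 2))   # customer 200 may buy product 2
forbidden = try_insert((4, 1, 300, 1))   # customer 300 has no permitted products
print(permitted, forbidden)
```

The second insert fails for the same reason as in the PostgreSQL session: the pair (300, 1) has no matching row in permitted_products.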

Compare start and end dates across multiple rows

I have a table of subscriptions for contacts. A contact can have multiple subscriptions:
CREATE TABLE contact (
id INTEGER NOT NULL,
name TEXT,
PRIMARY KEY (id)
);
CREATE TABLE subscription (
id INTEGER NOT NULL,
contact_id INTEGER NOT NULL REFERENCES contact(id),
start_date DATE,
end_date DATE,
PRIMARY KEY (id)
);
I need to get all subscriptions whose end date is not also the start date of
another subscription for the same contact.
So for the given data:
INSERT INTO contact (id, name) VALUES
(1, 'John'),
(2, 'Frank');
INSERT INTO subscription (id, contact_id, start_date, end_date) VALUES
(1, 1, '2012-01-01', '2013-01-01'),
(2, 1, '2013-01-01', '2014-01-01'),
(3, 2, '2012-01-01', '2012-09-01'),
(4, 2, '2013-01-01', '2014-01-01');
I want to get the subscriptions with ids 2, 3 and 4 but not 1, because the contact 'John'
has a subscription with a start_date on the same day (2013-01-01) as the end_date
of the subscription with id 1.
What is the best way to achieve this?
Use NOT EXISTS with a correlated subquery on the same table:
select *
from subscription s0
where not exists (
select 1
from subscription s1
where
s0.contact_id = s1.contact_id
and s1.start_date = s0.end_date
)
order by contact_id, id
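As a quick check, the query can be run against the sample data with Python's sqlite3 module. This is a minimal sketch that assumes SQLite stores the DATE columns as ISO text, so the equality comparison between start_date and end_date is a plain string comparison; the variable name remaining_ids is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contact (
    id   INTEGER NOT NULL,
    name TEXT,
    PRIMARY KEY (id)
);
CREATE TABLE subscription (
    id         INTEGER NOT NULL,
    contact_id INTEGER NOT NULL REFERENCES contact(id),
    start_date DATE,
    end_date   DATE,
    PRIMARY KEY (id)
);
INSERT INTO contact (id, name) VALUES (1, 'John'), (2, 'Frank');
INSERT INTO subscription (id, contact_id, start_date, end_date) VALUES
    (1, 1, '2012-01-01', '2013-01-01'),
    (2, 1, '2013-01-01', '2014-01-01'),
    (3, 2, '2012-01-01', '2012-09-01'),
    (4, 2, '2013-01-01', '2014-01-01');
""")

# Keep a subscription only if no other subscription of the same contact
# starts on the day this one ends.
remaining_ids = [row[0] for row in conn.execute("""
    SELECT s0.id
    FROM subscription s0
    WHERE NOT EXISTS (
        SELECT 1
        FROM subscription s1
        WHERE s0.contact_id = s1.contact_id
          AND s1.start_date = s0.end_date
    )
    ORDER BY s0.contact_id, s0.id
""")]
print(remaining_ids)  # [2, 3, 4]
```

Subscription 4 stays in the result even though it starts on 2013-01-01, because the subscription ending on that date belongs to a different contact.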