Importing data using temp tables in Power BI

I have the following SQL query that I want to use to import data into Power BI. It creates a temp table and then uses that table as the main data source. How can I do this in Power BI? I tried using this query in the editor when loading data from the database, but I keep getting an error.
I basically took this dataset https://www.kaggle.com/kyanyoga/sample-sales-data and loaded it into a PostgreSQL database.
-- 1. Create temp table to house temporary results
DROP TABLE IF EXISTS product_quantity;
CREATE TEMP TABLE product_quantity
(product_line varchar, this_month_quantity integer, last_month_quantity integer);
--2. Quantity ordered for each Product line for current month is inserted into temporary table.
INSERT INTO product_quantity (product_line, this_month_quantity, last_month_quantity)
SELECT "productline", SUM("quantityordered"), 0
FROM test_schema.sales_data_sample
where "month_id" = 3 and "year_id" = 2003
GROUP BY "productline";
--3. Quantity ordered for each Product line for last month is inserted into temporary table.
INSERT INTO product_quantity (product_line, this_month_quantity, last_month_quantity)
SELECT "productline", 0, SUM("quantityordered")
FROM test_schema.sales_data_sample
where "month_id" = 2 and "year_id" = 2003
GROUP BY "productline";
--4. Retrieve required results.
select
"product_line",
sum("this_month_quantity") as "this_month_quantity",
sum("last_month_quantity") as "last_month_quantity"
FROM product_quantity
group by "product_line";

Does this query run without an error?
I've converted your query into one big inline query:
select
ST."product_line",
sum(ST."this_month_quantity") as "this_month_quantity",
sum(ST."last_month_quantity") as "last_month_quantity"
FROM
(
SELECT "productline",
SUM("quantityordered") as this_month_quantity,
0 as last_month_quantity
FROM test_schema.sales_data_sample
where "month_id" = 3 and "year_id" = 2003
GROUP BY "productline"
UNION ALL
SELECT "productline",
0,
SUM("quantityordered")
FROM test_schema.sales_data_sample
where "month_id" = 2 and "year_id" = 2003
GROUP BY "productline"
) as ST
group by ST."product_line"
(note: I've just taken a guess at the conversion, since I don't have a PostgreSQL instance to test on)
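If you'd rather keep the temp-table structure readable, a CTE gives you the same single-statement shape, which matters because Power BI only accepts one statement when you paste SQL into the "SQL statement" box under the connector's Advanced options. A sketch of the CTE rewrite (same tables as above; untested):
WITH product_quantity AS (
    SELECT "productline" AS product_line,
           SUM("quantityordered") AS this_month_quantity,
           0 AS last_month_quantity
    FROM test_schema.sales_data_sample
    WHERE "month_id" = 3 AND "year_id" = 2003
    GROUP BY "productline"
    UNION ALL
    SELECT "productline", 0, SUM("quantityordered")
    FROM test_schema.sales_data_sample
    WHERE "month_id" = 2 AND "year_id" = 2003
    GROUP BY "productline"
)
SELECT product_line,
       SUM(this_month_quantity) AS this_month_quantity,
       SUM(last_month_quantity) AS last_month_quantity
FROM product_quantity
GROUP BY product_line;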

Related

SQL INSERT INTO WHERE NOT EXISTS with multiple conditions

I have a SQL Server database. I am looking to insert some values into multiple tables, where the data does not already exist.
Example:
Table 1

ID | Name      | Material | Other
---|-----------|----------|------
1  | Aluminum  | 2014     | v1
2  | Magnesium | 2013     | v2
I want to develop a stored procedure that will insert the following row into a table:
Aluminum | 2013
My current stored procedure does not let me do this, as it recognizes Magnesium | 2013 and rejects 2013 because it is being duplicated.
Also, how would we compare multiple column values? For example:
INSERT WHERE NOT EXISTS (Material = 2014 AND Other = v3)
Current stored procedure:
IF EXISTS(SELECT *
FROM dbo.[01_matcard24]
WHERE NOT EXISTS (SELECT [01_matcard24].Element
FROM dbo.[01_matcard24]
WHERE dbo.[01_matcard24].Element = @new_element)
AND NOT EXISTS (SELECT [01_matcard24].Material
FROM dbo.[01_matcard24]
WHERE dbo.[01_matcard24].Material = @new_material)
)
INSERT INTO dbo.[15_matcard24_basis-UNUSED] (Element, Material)
VALUES (@new_element, @new_material)
Rather than using an IF statement, keep it set-based as part of the INSERT. Create your EXISTS sub-query to detect unique rows, not individual columns:
INSERT INTO dbo.[15_matcard24_basis-UNUSED] (Element, Material)
SELECT @new_element, @new_material
WHERE NOT EXISTS (
SELECT 1
FROM dbo.[15_matcard24_basis-UNUSED]
WHERE Element = @new_element AND Material = @new_material
-- AND Other = @Other
);
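If you want this wrapped back up as the stored procedure, a minimal sketch (the procedure name and parameter types are assumptions; adjust them to your schema):
CREATE PROCEDURE dbo.usp_InsertMatcard  -- hypothetical name
    @new_element  varchar(50),          -- assumed type
    @new_material varchar(50)           -- assumed type
AS
BEGIN
    SET NOCOUNT ON;
    -- set-based conditional insert: only adds the row if the exact pair is new
    INSERT INTO dbo.[15_matcard24_basis-UNUSED] (Element, Material)
    SELECT @new_element, @new_material
    WHERE NOT EXISTS (
        SELECT 1
        FROM dbo.[15_matcard24_basis-UNUSED]
        WHERE Element = @new_element AND Material = @new_material
    );
END;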

For loop with output arrays

In Snowflake, I have two tables available:
"SEG_HISTO": a segmentation run once a month. Columns: Client ID / date (1st of each month) / segment.
"TCK": a table that contains the tickets. Columns: Ticket ID / Customer ID / Date / Amount.
For each customer ID in the "SEG_HISTO" table, I search for all of the customer's tickets over a rolling year and associate the sum of the amount spent:
SELECT SEG_OMNI.*, TCK_12M.TOTAL_AMOUNT_HT
FROM "SHARE"."DATAMARTS_DATASCIENCE"."SEG_OMNI" SEG_OMNI
LEFT OUTER JOIN
(
SELECT DISTINCT PR_ID_BU,
SUM(TOTAL_AMOUNT_HT) AS "TOTAL_AMOUNT_HT",
COUNT(*) "NB_ACHAT"
FROM
(
SELECT * FROM "SHARE"."RAW_BDC"."TCK"
WHERE TO_DATE(DT_SALE) >= DATEADD(YEAR, -1, '2022-07-01') -- <<<===== date add manually
)
GROUP BY PR_ID_BU
) TCK_12M
ON SEG_OMNI."pr_id_bu" = TCK_12M.PR_ID_BU
Now I need to create a for loop that iterates this for each date in the SEG_OMNI table (SELECT DISTINCT TO_DATE(DT_MAJ) DT FROM "SHARE"."DATAMARTS_DATASCIENCE"."SEG_HISTO") and stack the output in a view.
And this is where I'm stuck.
Thank you in advance for your help.
As Dave said in the comments, it would be better if you could figure out how to run all this in one query, instead of running the same query multiple times.
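For example, a minimal sketch of that one-query shape (column names are borrowed from the question, and I'm assuming SEG_HISTO carries the same pr_id_bu key as SEG_OMNI; untested):
SELECT s."pr_id_bu",
       TO_DATE(s.DT_MAJ)      AS seg_date,
       SUM(t.TOTAL_AMOUNT_HT) AS total_amount_ht,
       COUNT(t.PR_ID_BU)      AS nb_achat
FROM "SHARE"."DATAMARTS_DATASCIENCE"."SEG_HISTO" s
LEFT OUTER JOIN "SHARE"."RAW_BDC"."TCK" t
  ON t.PR_ID_BU = s."pr_id_bu"
 AND TO_DATE(t.DT_SALE) >  DATEADD(YEAR, -1, TO_DATE(s.DT_MAJ))
 AND TO_DATE(t.DT_SALE) <= TO_DATE(s.DT_MAJ)
GROUP BY s."pr_id_bu", TO_DATE(s.DT_MAJ);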
But as you are asking how to output the results of multiple queries from one stored procedure, I'm going to give you the pattern for that here. I'm also assuming you want this in SQL (Snowflake Scripting); we could use Python/Java/JS instead:
declare
all_dates cursor for (
select dates
from your_table
);
begin
-- create a table to store results
create or replace temp table discovery_results(x string, y string, z int);
for record in all_dates do
-- for each date, run the query and insert the results into the table created above
insert into discovery_results
select x, y, z
from the_query
where (:dates_cursor_data)
;
end for;
return 'run [select * from discovery_results] to find the results';
end;
select *
from discovery_results
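Applied to the tables from the question, the pattern might look something like this (untested sketch; I copy the cursor value into a local variable so it can be bound with a colon inside the INSERT):
declare
  d date;
  all_dates cursor for (
    select distinct to_date(dt_maj) as dt
    from "SHARE"."DATAMARTS_DATASCIENCE"."SEG_HISTO"
  );
begin
  -- one row per (customer, segmentation date) with the rolling-year spend
  create or replace temp table seg_histo_12m (pr_id_bu string, dt date, total_amount_ht number, nb_achat int);
  for record in all_dates do
    d := record.dt;
    insert into seg_histo_12m
      select t.PR_ID_BU, :d, sum(t.TOTAL_AMOUNT_HT), count(*)
      from "SHARE"."RAW_BDC"."TCK" t
      where to_date(t.DT_SALE) >= dateadd(year, -1, :d)
        and to_date(t.DT_SALE) <= :d
      group by t.PR_ID_BU;
  end for;
  return 'run [select * from seg_histo_12m] to see the results';
end;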

Create an Impala text table where rows meet a condition

I am trying to create a table in Impala (SQL) that takes rows from a Parquet table. The data represents bike rides in a city. A row should be imported into the new table if its starting station code (a string, e.g. '6100') shows up more than 100 times in the first table. Here's what I have so far:
-- I am using Apache Impala via the Hue editor
invalidate metadata;
set compression_codec=none;
invalidate metadata;
Set compression_codec=gzip;
create table bixirides_parquet (
start_date string, start_station_code string,
end_date string, end_station_code string,
duration_sec int, is_member int)
stored as parquet;
Insert overwrite table bixirides_parquet select * from bixirides_avro;
invalidate metadata;
set compression_codec=none;
create table impala_out stored as textfile as select start_date, start_station_code, end_date, end_station_code, duration_sec, is_member, count(start_station_code) as count
from bixirides_parquet
having count(start_station_code)>100;
For some reason the statement will run, but no rows are inserted into the new table. It should import a row into the new table if that row's starting station code shows up more than 100 times in the original table. I think I'm wording my select statement improperly, but I'm not sure how exactly.
I think the final query you want is:
select start_date, start_station_code, end_date,
end_station_code, duration_sec, is_member, cnt
from (select bp.*,
count(*) over (partition by start_station_code) as cnt
from bixirides_parquet bp
) bp
where cnt > 100;
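And if you want that result written to your textfile table, the same query should slot straight into the CTAS you already have (untested):
create table impala_out stored as textfile as
select start_date, start_station_code, end_date,
       end_station_code, duration_sec, is_member, cnt
from (select bp.*,
             count(*) over (partition by start_station_code) as cnt
      from bixirides_parquet bp
     ) bp
where cnt > 100;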

SQL - Create table from Select + user defined columns and values

Currently I have the following SELECT statement:
CREATE TABLE TEST AS
SELECT ROW_ID,
PROM_INTEG_ID,
INTEGRATION_ID,
BILL_ACCNT_ID,
SERV_ACCT_ID,
CFG_STATE_CD
FROM PRODUCTS
WHERE PROD_ID = 'TestProduct'
AND STATUS_CD = 'Active';
However, I have to add some additional columns which do not exist in the PRODUCTS table and define them with my own names, e.g. HIERARCHY.
I tried using the WITH clause in my SQL query, but it keeps failing because the syntax is wrong.
CREATE TABLE TEST AS
SELECT ROW_ID,
PROM_INTEG_ID,
INTEGRATION_ID,
BILL_ACCNT_ID,
SERV_ACCT_ID,
CFG_STATE_CD
WITH
PRODUCT_HEIRARCHY varchar2(30) 'Test123Value'
FROM PRODUCT
WHERE PROD_ID = 'TestProduct'
AND STATUS_CD = 'Active';
So in summary, I want to pull in columns from an existing table as well as defining some of my own.
Any help appreciated
Just add the columns to the select:
CREATE TABLE TEST AS
SELECT ROW_ID, PROM_INTEG_ID, INTEGRATION_ID, BILL_ACCNT_ID, SERV_ACCT_ID, CFG_STATE_CD,
CAST('Test123Value' AS VARCHAR2(30)) as PRODUCT_HIERARCHY
FROM PRODUCTS
WHERE PROD_ID = 'TestProduct' AND STATUS_CD = 'Active';
Note that the cast() is not strictly necessary, but it is a good idea if you want the column to have a specific type.
You could also create the table using a CTE (the WITH clause):
CREATE TABLE t
AS
WITH data AS (
SELECT...
)
SELECT *
FROM data
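Filled in with the columns from the question, that CTE pattern might look like this (untested sketch):
CREATE TABLE TEST AS
WITH data AS (
    SELECT ROW_ID,
           PROM_INTEG_ID,
           INTEGRATION_ID,
           BILL_ACCNT_ID,
           SERV_ACCT_ID,
           CFG_STATE_CD,
           CAST('Test123Value' AS VARCHAR2(30)) AS PRODUCT_HIERARCHY
    FROM PRODUCTS
    WHERE PROD_ID = 'TestProduct'
      AND STATUS_CD = 'Active'
)
SELECT *
FROM data;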

Update SQL table from csv on multiple columns

I want a basic update procedure that updates a temporary table and orders it by PrimID and myDates, and then updates a permanent table. The data structure looks like this:
PrimID | MyDates  | Price
1      | 1/1/2014 | 1
1      | 1/2/2014 | 2
2      | 1/1/2014 | 11
2      | 1/2/2014 | 12
3      | 1/1/2014 | 21
3      | 1/2/2014 | 22
The csv file looks exactly the same, just without the header column names. Here is my code thus far:
CREATE TABLE #TempT
(
PrimID Int,
myDate Date,
myPrice Float
);
BULK INSERT #TempT
FROM 'D:\MyWerk\SQL\TEST_dPrice_Data.csv'
WITH
(
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n'
);
SELECT * FROM #TempT
ORDER BY PrimID, myDate;
DROP TABLE #TempT;
What is missing, and what I am trying to get to, is the UPDATE of the permanent table with the ordered #TempT, ordered by PrimID and then myDates (oldest to latest). If there are PrimID & myDates combinations in the csv that are already in the permanent table, I want to overwrite that data in the permanent table as well. Also, is there a better way to get the data in chronological order than using ORDER BY?
I use SQL Server 2012.
Much appreciated.
Don't try to store your data in SQL tables in some kind of row order; tables have no inherent order, and maintaining one is inefficient. You can sort when you query the data.
As for the insert/update behavior, a SQL merge does this quite well. After your Bulk Insert, you can execute something like this:
MERGE PermanentT AS [TARGET]
USING #TempT AS [SOURCE]
ON [TARGET].PrimID = [SOURCE].PrimID
AND [TARGET].MyDates = [SOURCE].myDate
WHEN MATCHED AND [TARGET].Price <> [SOURCE].myPrice
THEN UPDATE SET [TARGET].Price = [SOURCE].myPrice
WHEN NOT MATCHED
THEN INSERT (PrimID, MyDates, Price)
VALUES ([SOURCE].PrimID, [SOURCE].myDate, [SOURCE].myPrice);
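If you want to see what the MERGE actually did, T-SQL's OUTPUT clause can be appended just before the terminating semicolon. A sketch, assuming the permanent table has the columns shown in the question:
MERGE PermanentT AS [TARGET]
USING #TempT AS [SOURCE]
ON [TARGET].PrimID = [SOURCE].PrimID
AND [TARGET].MyDates = [SOURCE].myDate
WHEN MATCHED AND [TARGET].Price <> [SOURCE].myPrice
THEN UPDATE SET [TARGET].Price = [SOURCE].myPrice
WHEN NOT MATCHED
THEN INSERT (PrimID, MyDates, Price)
VALUES ([SOURCE].PrimID, [SOURCE].myDate, [SOURCE].myPrice)
-- report each row affected and whether it was an INSERT or an UPDATE
OUTPUT $action AS merge_action, inserted.PrimID, inserted.MyDates, inserted.Price;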