Oracle data warehouse - Handling data that isn't in any dimension - sql

We are implementing a data warehouse and we are stuck on a problem. We want to load this data into the fact table (num_class is not a FK/PK):
STUDENT    NUM_CLASS COURSE       CLASS SHIFT  YEAR  DATE
---------- --------- ------------ ----- ------ ----- ----------
1100882    1         EID          SAD   OT     13/14 28-05-2014
1100882    1         EID          SAD   PL2    13/14 28-05-2014
This is the procedure we use to load the data into the fact table:
PROCEDURE load_fact_table_presencas IS
   v_source_lines INTEGER;
BEGIN
   -- JUST FOR STATISTICS
   SELECT COUNT(*)
   INTO v_source_lines
   FROM T_CLEAN_presencas;

   INSERT INTO t_fact_presencas(aluno_id, uc_id, curso_id, turno_id, id_tempo, numero_aula)
   SELECT
      pre.aluno_id,
      uc.uc_id,
      curso.curso_id,
      t.turno_id,
      ano.id_tempo,
      pre.numero_aula
   FROM
      t_dim_turnos t,
      T_DIM_ANO_LETIVO ano,
      T_DIM_UCS uc,
      T_DIM_CURSOS curso,
      T_CLEAN_PRESENCAS pre
   WHERE
      -- joins to get dimension keys using sources' natural keys
      ano.data_completa = TO_CHAR( pre."data", 'dd-mm-yyyy') AND
      NVL(pre.TURNO_ID, pck_error_codes.c_load_invalid_dim_record_Nkey) = t.turno_id_natural AND
      curso.curso_id_natural = pre.curso_id AND
      uc.uc_id_natural = pre.uc_id AND
      --t.turno_id_natural = pre.turno_id AND
      -- excludes EXPIRED VERSIONS of curso and uc dimensions
      uc.is_expired_version = 'NO' AND
      curso.IS_EXPIRED_VERSION = 'NO';

   pck_log.write_log('Info: '||SQL%ROWCOUNT ||' fact(s) (presencas) loaded');
   pck_log.write_log('Done!');
EXCEPTION
   WHEN NO_DATA_FOUND THEN
      pck_log.write_log('Info: No facts generated from '||v_source_lines||' source clean presencas');
   WHEN OTHERS THEN
      pck_log.write_log('Error: Could not load fact table (presencas) ['||sqlerrm||']');
      RAISE e_load;
END;
The problem is that some of the data we want to load doesn't exist in any dimension, so when we load it we assign each such row a "special registry" (a placeholder dimension key). This raises a primary key constraint violation, because all the other keys of two such rows can be identical.
The PK is STUDENT_id + COURSE_id + CLASS_id + SHIFT + DATE_id. My dimensions are STUDENT, COURSE, CLASS, SHIFT, DATE.

Related

Stored procedure

I have 4 tables to be used in the procedure
business(abnnumber,name)
business_industry(abnnumber,industryid)
industry(industryid,unionid)
trade_union(unionid)
I was assigned to print the trade union title on one line and each business's ABNNUMBER and business name on separate lines, using a stored procedure.
What I tried is:
CREATE OR REPLACE PROCEDURE INDUSTRY_INFORMATION
(P_INDUSTRYID in integer,
 P_UNIONTITLE OUT VARCHAR2,
 P_BUSINESSNAME OUT VARCHAR2) AS
BEGIN
   SELECT TRADE_UNION.UNIONTITLE, BUSINESS.BUSINESSNAME
   INTO P_UNIONTITLE, P_BUSINESSNAME
   FROM BUSINESS
   INNER JOIN BUSINESS_INDUSTRY ON BUSINESS.ABNNUMBER = BUSINESS_INDUSTRY.ABNNUMBER
   INNER JOIN INDUSTRY ON BUSINESS_INDUSTRY.INDUSTRYID = INDUSTRY.INDUSTRYID
   INNER JOIN TRADE_UNION ON INDUSTRY.UNIONID = TRADE_UNION.UNIONID;
END;
Sample data is available at http://www.mediafire.com/file/8c4dwn4n88n8a42/strd_procedure.txt
Required output should be
UNIONTITLE (one line)
ABNNUMBER BUSINESS NAME (next line)
I suspect that you need something like this:
create or replace procedure industry_info is
begin
   for r in (
      select tu.uniontitle ut,
             listagg('['||b.abnnumber||'] '||b.businessname, ', ')
                within group (order by b.businessname) blist
      from business b
      join business_industry bi on b.abnnumber = bi.abnnumber
      join industry i on bi.industryid = i.industryid
      join trade_union tu on i.unionid = tu.unionid
      group by tu.uniontitle )
   loop
      dbms_output.put_line(r.ut);
      dbms_output.put_line(r.blist);
      dbms_output.put_line('-----');
   end loop;
end;
The LISTAGG function is available in Oracle 11g Release 2 and later.
Output:
Cleaners' Union
[12345678912] Consolidated Proerty Services, [12345678929] Gold Cleaning Services, [12345678926] Home Cleaning Services, [12345678924] Shine Cleaning
-----
Construction Workers' Union
[12345678920] Build a House, [12345678919] Construction Solutions, [12345678922] Joe's Rubbish Removal, [12345678918] Leak and Roof Repair, [12345678928] Muscle Rubbish Removals
-----
Electricians' Union
[12345678916] Change the Fuse Electricals, [12345678921] Hire a Wire, [12345678917] Vicky Electricals
-----
Movers' Union
[12345678913] Kohlan Movers, [12345678925] Moveit
-----
Mowers' Union
[12345678923] Do it Right Mowers, [12345678911] James Mowers and Landscape
-----
Plumbers' Union
[12345678927] 24X7 Plumbing Service, [12345678915] Anytime Plumbers, [12345678914] Pumbers Delivered
-----
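To make the aggregation step easier to follow outside the database, here is a minimal Python sketch of what LISTAGG ... WITHIN GROUP (ORDER BY businessname) combined with GROUP BY uniontitle does. It is only an illustration, not the procedure itself: the function name list_businesses_by_union and the hard-coded rows (a small subset of the output above) are assumptions made for the example.

```python
from collections import defaultdict

# Rows as they would come out of the four-table join: (uniontitle, abnnumber, businessname).
# Hypothetical subset of the sample data, hard-coded for illustration.
rows = [
    ("Movers' Union", 12345678925, "Moveit"),
    ("Movers' Union", 12345678913, "Kohlan Movers"),
    ("Mowers' Union", 12345678911, "James Mowers and Landscape"),
    ("Mowers' Union", 12345678923, "Do it Right Mowers"),
]

def list_businesses_by_union(joined_rows):
    """Group rows by union title and join the businesses into one string per union."""
    groups = defaultdict(list)
    for union_title, abn, name in joined_rows:
        # Store the name first so that sorting orders by business name.
        groups[union_title].append((name, abn))
    result = {}
    for union_title in sorted(groups):
        members = sorted(groups[union_title])
        result[union_title] = ", ".join(f"[{abn}] {name}" for name, abn in members)
    return result

for title, business_list in list_businesses_by_union(rows).items():
    print(title)
    print(business_list)
    print("-----")
```

Each union ends up with a single comma-separated line, which mirrors the r.blist column in the cursor loop.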

Split Text into Table Rows with Read-Only Permissions

I am a read-only user for a database with the following problem:
Scenario:
Call center employees for a company submit tickets to me through our database on behalf of our clients. The call center includes alphanumeric lot numbers of an exact length in their message for me to troubleshoot. Depending on how many times a ticket is updated, there could be several messages for one ticket, each of them having zero or more of these alphanumeric lot numbers embedded in the message. I can access all of these messages with Oracle SQL and SQL Tools.
How can I extract just the lot numbers to make a single-column table of all the given lot numbers?
Example Data:
-- Accessing Ticket 1234 --
SELECT *
FROM communications_detail
WHERE ticket_num = 1234;
-- Results --
TICKET_NUM | MESSAGE_NUM | MESSAGE
------------------------------------------------------------------------------
1234 | 1 | A customer recently purchased some products with
| | a lot number of vwxyz12345 and wants to know if
| | they have been recalled.
------------------------------------------------------------------------------
1234 | 2 | Same customer found lots vwxyz23456 and zyxwv12345
| | in their storage as well and would like those checked.
------------------------------------------------------------------------------
1234 | 3 | These lots have not been recalled. Please inform
| | the client.
So-Far:
I am able to isolate the lot numbers of a constant string with the following code, but it gets put into standard output and not a table format.
DECLARE
   msg VARCHAR2(200) := 'Same customer found lots xyz23456 and zyx12345 in their storage as well and would like those checked.';
   cnt NUMBER := regexp_count(msg, '[[:alnum:]]{10}');
BEGIN
   IF cnt > 0 THEN
      FOR i IN 1..cnt LOOP
         Dbms_Output.put_line(regexp_substr(msg, '[[:alnum:]]{10}', 1, i));
      END LOOP;
   END IF;
END;
/
Goals:
Output results into a table that can itself be used as a table in a larger query statement.
Somehow be able to apply this to all of the messages associated with the original ticket.
Update: Changed the example lot numbers from 8 to 10 characters long to avoid confusion with real words in the messages. The real-world scenario has much longer codes and very specific formatting, so a more complex regular expression will be used.
Update 2: Tried using a table variable instead of standard output. It didn't error, but it didn't populate my query tab... This may just be user error...!
DECLARE
   TYPE lot_type IS TABLE OF VARCHAR2(10);
   lots lot_type := lot_type();
   msg VARCHAR2(200) := 'Same customer found lots xyz23456 and zyx12345 in their storage as well and would like those checked.';
   cnt NUMBER := regexp_count(msg, '[[:alnum:]]{10}');
BEGIN
   IF cnt > 0 THEN
      FOR i IN 1..cnt LOOP
         lots.extend();
         lots(i) := regexp_substr(msg, '[[:alnum:]]{10}', 1, i);
      END LOOP;
   END IF;
END;
/
This regex matches the lot mask you provided: '[a-z]{3}[0-9]{5}'. Using a specific pattern like this will help you avoid the false positives you mention in your question.
Now here is a read-only, pure SQL solution for you.
with cte as (
   select 'Same customer found lots xyz23456 and zyx12345 in their storage as well and would like those checked.' msg
   from dual)
select regexp_substr(msg, '[a-z]{3}[0-9]{5}', 1, level) as lotno
from cte
connect by level <= regexp_count(msg, '[a-z]{3}[0-9]{5}')
;
I'm using the WITH clause just to generate the data. The important thing is the use of the CONNECT BY clause, which is part of Oracle's hierarchical query syntax but here generates multiple rows from a single row. The pseudo-column LEVEL lets us step through the string and pick out each successive occurrence of the regex pattern.
Here's the output:
SQL> r
1 with cte as ( select 'Same customer found lots xyz23456 and zyx12345 in their storage as well and would like those checked.' msg from dual)
2 select regexp_substr(msg, '[a-z]{3}[0-9]{5}', 1, level) as lotno
3 from cte
4 connect by level <= regexp_count(msg, '[a-z]{3}[0-9]{5}')
5*
LOTNO
----------
xyz23456
zyx12345
SQL>
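For readers who want to check the pattern away from the database, the following Python sketch shows the same idea as the LEVEL trick: walk through each message and collect every occurrence of the lot mask. It is an illustration under assumptions, not part of the SQL solution: the function name extract_lots and the first sample message (with the made-up lot abc12345) are hypothetical.

```python
import re

# Lot mask from the answer above: three lowercase letters followed by five digits.
LOT_PATTERN = re.compile(r"[a-z]{3}[0-9]{5}")

def extract_lots(messages):
    """Return every lot number found in the messages, in order of appearance."""
    lots = []
    for message in messages:
        # findall returns all non-overlapping matches, like iterating LEVEL from 1 to regexp_count.
        lots.extend(LOT_PATTERN.findall(message))
    return lots

# Hypothetical messages for one ticket; the second one is the string used in the answer.
messages = [
    "A customer recently purchased some products with a lot number of abc12345.",
    "Same customer found lots xyz23456 and zyx12345 in their storage as well and would like those checked.",
    "These lots have not been recalled. Please inform the client.",
]

print(extract_lots(messages))
```

Messages without any lot number simply contribute nothing to the result, which is the behaviour wanted when the query is applied to every message of a ticket.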

Getting Duplicates in Person ID and ASSIGNMENT_ID

This is the query I'm using:
select DISTINCT "HRG_GOAL_ACCESS"."PERSON_ID" as "PERSON_ID",
"HRG_GOAL_ACCESS"."BUSINESS_GROUP_ID" as "BUSINESS_GROUP_ID",
"HRG_GOALS"."GOAL_ID" as "GOAL_ID",
"HRG_GOALS"."ASSIGNMENT_ID" as "ASSIGNMENT_ID",
"HRG_GOALS"."GOAL_NAME" as "GOAL_NAME",
"HRG_MASS_REQ_RESULTS"."ORGANIZATION_ID" as "ORGANIZATION_ID",
"HRG_MASS_REQ_RESULTS"."RESULT_CODE" as "RESULT_CODE",
"HRG_GOAL_PLN_ASSIGNMENTS"."CREATED_BY" as "CREATED_BY"
from "FUSION"."HRG_GOAL_PLN_ASSIGNMENTS" "HRG_GOAL_PLN_ASSIGNMENTS",
"FUSION"."HRG_MASS_REQ_RESULTS" "HRG_MASS_REQ_RESULTS",
"FUSION"."HRG_GOALS" "HRG_GOALS",
"FUSION"."HRG_GOAL_ACCESS" "HRG_GOAL_ACCESS"
where "HRG_GOAL_ACCESS"."PERSON_ID"="HRG_GOALS"."PERSON_ID"
and "HRG_MASS_REQ_RESULTS"."PERSON_ID"="HRG_GOALS"."PERSON_ID"
and "HRG_GOAL_PLN_ASSIGNMENTS"."PERSON_ID"="HRG_MASS_REQ_RESULTS"."PERSON_ID"
Output
PERSON_ID BUSINESS_GROUP_ID GOAL_ID ASSIGNMENT_ID GOAL_NAME RESULT_CODE CREATED_BY
---------------- ----------------- --------------- --------------- ------------------ -------------------- -------------------
300000048030404 1 300000137711224 300000048033078 NANO_CLASS SUCCESS anonymous G_1
300000048030404 1 300000137637946 300000048033078 INCREASE SALES BY 40% SUCCESS REDDI.SAREDDY G_1
300000048030404 1 300000137637946 300000048033078 INCREASE SALES BY 40% SUCCESS CURTIS.FEITTY
Your output does not contain duplicates. You have more than one row for PERSON_ID (300000048030404), but that's because the master table (HRG_GOAL_ACCESS, presumably) has multiple matching rows in its child tables.
Each row has different details, so the set is valid. The rows differ in their values of HRG_GOALS.GOAL_ID, HRG_GOALS.GOAL_NAME and HRG_GOAL_PLN_ASSIGNMENTS.CREATED_BY.
If this response does not satisfy you, you need to explain more clearly what your desired output would look like. Alternatively, you need to work through your data model and understand why your query returns the data it does. You probably have a missing join condition; the use of DISTINCT could be hindering you in finding that out.

SQL Determining differences in near-identical rows

I have a table of correct data that I need to check my actual table against, to make sure the actual data is correct. Some rows look like the following:
Data_Check_Table
FRUIT   PRICE  WEEKS_FRESH  SUPPLIER
Apple   $1     1            Big Co.
Banana  $1     1            Super Co.
and the actual table with this info:
Data_Table
FRUIT   PRICE  WEEKS_FRESH  SUPPLIER
Apple   $2     1            Big Co.
Banana  $1     1            Super Co.
...and assume there are many other rows; some match up fine and others have inconsistencies in certain columns (maybe the wrong price, the wrong supplier, or even both). How would I write a select to find the rows that are inconsistent with the correct data?
Select dt.Fruit, dt.Price, dt.Weeks_Fresh, dtc.Fruit, dtc.Price, dtc.Weeks_Fresh,...
From Data_Table dt
FULL OUTER JOIN Data_Check_Table dtc
ON dt.Fruit = dtc.Fruit
AND dt.Price = dtc.Price
.....
Where dt.Fruit IS NULL OR dtc.Fruit IS NULL
The full join includes records from each table regardless of whether there is a match, so if either side is null then you know there is a mismatch.
The following finds actual records that do not match the correct records:
select *
from Data_Table
minus
select *
from Data_Check_Table
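As a quick way to try the set-difference approach without an Oracle instance, here is a small sketch using Python's built-in sqlite3 module. It assumes the two tables from the question with the column types guessed for illustration, and it uses EXCEPT, which is the SQLite (and ANSI) spelling of Oracle's MINUS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are assumptions; the question only shows the column names.
conn.executescript("""
    CREATE TABLE data_check_table (fruit TEXT, price TEXT, weeks_fresh INTEGER, supplier TEXT);
    CREATE TABLE data_table       (fruit TEXT, price TEXT, weeks_fresh INTEGER, supplier TEXT);
    INSERT INTO data_check_table VALUES ('Apple', '$1', 1, 'Big Co.');
    INSERT INTO data_check_table VALUES ('Banana', '$1', 1, 'Super Co.');
    INSERT INTO data_table       VALUES ('Apple', '$2', 1, 'Big Co.');
    INSERT INTO data_table       VALUES ('Banana', '$1', 1, 'Super Co.');
""")

# Rows present in the actual table that have no identical row in the check table.
mismatches = conn.execute("""
    SELECT * FROM data_table
    EXCEPT
    SELECT * FROM data_check_table
""").fetchall()

print(mismatches)
```

Only the Apple row is returned, because its price differs from the check table. Swapping the two SELECTs would instead list correct rows that are missing from the actual table.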

Count Unique Occurrences PowerPivot

I am new to PowerPivot and DAX formulas. I assume that what I am trying to do is very basic and has probably been answered somewhere; I just don't know what to search for to find it.
I am trying to determine the percentage of sales people who had a sale in a given quarter. I have two tables, one that lists the sales people and one that lists all the sales for a quarter. For example:
Employee ID
123
456
789
Sales ID - Emp ID - Amount
135645 ---- 123 ----- $50
876531 ---- 123 ----- $127
258546 ---- 123 ----- $37
516589 ---- 789 ----- $128
998513 ---- 789 ----- $79
As a result, the pivot table would look like this:
Emp ID - % w/ sales
123 -------- 100%
456 -------- 0%
789 -------- 100%
Total ------- 66%
If you can point me to a post where this has been addressed or let me know the best way to address this I would appreciate it. Thank you.
Here's a simple way of doing this (assuming table names emps and sales):
=IF (DISTINCTCOUNT ( sales[Emp ID] ) = BLANK (),
0,
DISTINCTCOUNT ( sales[Emp ID] )
)
/ COUNTROWS ( emps )
The IF() is only required to ensure that people who haven't made a sale appear in the Pivot. All the actual formula is doing is dividing the number of distinct employees that appear in the sales table by the number of employee rows.
Jacob
You'll need to remove the text that begins with --; I added it to describe what the DAX is doing. This may not do what you want, because it only considers the employees in the current filter context. For example, if the user filtered out all employees that didn't have sales, should the grand total be 100% or 66%? The DAX below respects the user's filter, so it would return 100% in that case; to get 66% you would need the ALL DAX function. I'm very new to DAX, so I'm sure there's a better way to do what you want.
=IF
(
-- are we processing 1 employee or multiple employees? (E.x.: The grand total processes multiple employees...)
COUNTROWS(VALUES(employee[Employee ID])) > 1,
--If Processing multiple employees do X / Y
-- Where X = The number of employees that have sales
-- Where Y = The number of employees selected by the user
COUNTROWS(FILTER(employee, NOT(ISBLANK(CALCULATE(COUNT(sales[Sales ID])))))) / COUNTROWS(employee),
-- If processing single employee, return 0 if they have no sales, and 1 if they have sales
IF
(
ISBLANK(CALCULATE(COUNT(sales[Sales ID]))),
0,
1
)
)
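The arithmetic behind both formulas can be checked with a short Python sketch. It is not DAX and does not model PowerPivot's filter context; the function name share_with_sales is hypothetical, and the data is the sample from the question. The employee_ids argument plays the role of "the employees selected by the user".

```python
# Sample data from the question.
employees = [123, 456, 789]
sales = [
    (135645, 123, 50),
    (876531, 123, 127),
    (258546, 123, 37),
    (516589, 789, 128),
    (998513, 789, 79),
]

def share_with_sales(employee_ids, sales_rows):
    """Fraction of the given employees that have at least one sale."""
    selected = set(employee_ids)
    if not selected:
        return 0.0
    # Distinct employee IDs that appear in the sales rows.
    sellers = {emp_id for _sales_id, emp_id, _amount in sales_rows}
    return len(selected & sellers) / len(selected)

for emp in employees:
    print(emp, f"{share_with_sales([emp], sales):.0%}")
print("Total", f"{share_with_sales(employees, sales):.1%}")
```

For a single employee the result is either 0 or 1, and for all three employees it is 2 out of 3, which corresponds to the 66% grand total in the expected pivot table.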