I am trying to build a package-based online bus ticket reservation system, where a ticket also includes other features such as entry tickets for sightseeing places. The bus fare is the same for every passenger, but the price of the other features may depend on the passenger type (Adult/Child/Senior ...). There will be tables such as other_feature, passenger_type and feature_type, and I am confused about how to relate these to the reservation system. Any help would be appreciated. Thanks in advance!
My table structure:
table reservation_details: id, passenger_id, travel_detail_id, purchase_detail_id, reserved_date, seat_no, depature_date
table travel_details: id, bus_id, route_id, depature_time, arrival_time, fare, freq_detail_id
table routes: id, name, distance, from_city, to_city
table freq_details: id, sun, mon, tue, wed, thu, fri, sat
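One way to relate these tables, as a minimal sketch rather than a definitive design: keep one price row per (feature, passenger type) pair and join it to the reservation through the passenger's type. The sketch uses Python's sqlite3 so it can be run as-is. The tables feature_price, reservation_feature and passengers are hypothetical names, feature_type is left out, and fare is stored directly on reservation_details here only to keep the example short (in the structure above it lives in travel_details).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE passenger_type (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE other_feature  (id INTEGER PRIMARY KEY, name TEXT);
-- Hypothetical linking table: one price per (feature, passenger type).
CREATE TABLE feature_price (
    feature_id        INTEGER REFERENCES other_feature(id),
    passenger_type_id INTEGER REFERENCES passenger_type(id),
    price             REAL,
    PRIMARY KEY (feature_id, passenger_type_id)
);
-- Simplified stand-ins for the tables in the question (assumed columns).
CREATE TABLE passengers (id INTEGER PRIMARY KEY, passenger_type_id INTEGER);
CREATE TABLE reservation_details (id INTEGER PRIMARY KEY, passenger_id INTEGER, fare REAL);
-- Hypothetical table: which features are included in which reservation.
CREATE TABLE reservation_feature (reservation_id INTEGER, feature_id INTEGER);

INSERT INTO passenger_type VALUES (1, 'Adult'), (2, 'Child');
INSERT INTO other_feature VALUES (1, 'Museum entry');
INSERT INTO feature_price VALUES (1, 1, 300), (1, 2, 150);
INSERT INTO passengers VALUES (10, 1), (11, 2);
INSERT INTO reservation_details VALUES (100, 10, 500), (101, 11, 500);
INSERT INTO reservation_feature VALUES (100, 1), (101, 1);
""")

# Total per reservation = flat bus fare + feature prices for that passenger's type.
totals = conn.execute("""
SELECT r.id, r.fare + COALESCE(SUM(fp.price), 0) AS total
FROM reservation_details AS r
JOIN passengers AS p ON p.id = r.passenger_id
LEFT JOIN reservation_feature AS rf ON rf.reservation_id = r.id
LEFT JOIN feature_price AS fp
       ON fp.feature_id = rf.feature_id
      AND fp.passenger_type_id = p.passenger_type_id
GROUP BY r.id, r.fare
ORDER BY r.id
""").fetchall()
print(totals)
```

With the made-up prices above, the adult and the child pay the same fare but different feature prices.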
After a lot of research and head scratching, I'm still unable to find a good, clean solution to convert an entity-attribute-value-timestamp table to an SCD Type 2 dimension.
Here's the issue:
I have a CRM source that stores all history in an EAVT model (Entity / Attribute / Value of the attribute / valid_from / valid_to).
So for every object (Company, product...etc) I have a table with the current state that is in a relational model, and another history table that contains all value changes to all attributes with a valid_from/valid_to column for validity of the values themselves.
I want to be able to merge these two tables into an SCD table with a Valid_To/Valid_From and a column per attribute.
To give an example:
Company has two tables:
Current state of the companies:
company_id  name       number_of_employees  city
==========  =========  ===================  =====
1           Company 1  500                  Paris
2           Company 2  500                  Paris
History table:
company_id  attribute            value     valid_from  valid_to
==========  ===================  ========  ==========  ==========
1           city                 New York  01/01/2020  01/05/2022
1           city                 Paris     01/05/2022  12/31/9999
1           number_of_employees  50        01/01/2021  01/01/2022
1           number_of_employees  100       01/01/2022  12/31/9999
What I want to have as a result is the following:
company_id  name       city      number_of_employees  valid_from  valid_to    is_active
==========  =========  ========  ===================  ==========  ==========  =========
1           Company 1  New York  null                 01/01/2020  01/01/2021  false
1           Company 1  New York  50                   01/01/2021  01/01/2022  false
1           Company 1  New York  100                  01/01/2022  01/05/2022  false
1           Company 1  Paris     100                  01/05/2022  12/31/9999  true
So based on this example, we have a company that started on 01/01/2020 with New York as city and number of employees wasn't populated at that time.
We then modified our company to add 50 as the number of employees, this happened on 01/01/2021.
We modified our company again on 01/01/2022 to change the number of employees to 100, and then changed the city of the company from New York to Paris on 01/05/2022.
This gives us 4 states for the company, so our SCD should contain a row per state or 4 rows.
The dates should be calculated to overlap and valid_from should be set to the valid_to of the attribute that changed from the "history" table, and valid_to should be set to the valid_from of the attribute that changed from the "history" table.
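The interval calculation described here can be sketched in plain Python, assuming ISO-formatted date strings and half-open validity ranges (a value is valid from valid_from up to, but not including, valid_to). Every valid_from/valid_to of an entity is treated as a potential state boundary, and each pair of consecutive boundaries becomes one SCD row. The function name eavt_to_scd2 and the tuple layout are hypothetical, and this is an illustration of the idea, not a tuned BigQuery solution.

```python
from collections import defaultdict

OPEN_END = "9999-12-31"  # assumed marker for "still valid"

def eavt_to_scd2(history):
    """history: iterable of (entity_id, attribute, value, valid_from, valid_to)."""
    by_entity = defaultdict(list)
    attributes = set()
    for entity_id, attribute, value, valid_from, valid_to in history:
        by_entity[entity_id].append((attribute, value, valid_from, valid_to))
        attributes.add(attribute)

    rows = []
    for entity_id, changes in by_entity.items():
        # Every valid_from / valid_to is a potential boundary between two states.
        boundaries = sorted({d for _, _, f, t in changes for d in (f, t)})
        for start, end in zip(boundaries, boundaries[1:]):
            state = {a: None for a in attributes}
            covered = False
            for attribute, value, f, t in changes:
                if f <= start and end <= t:  # value covers the whole interval
                    state[attribute] = value
                    covered = True
            if covered:  # skip gaps where no attribute has a value
                rows.append({"entity_id": entity_id, **state,
                             "valid_from": start, "valid_to": end,
                             "is_active": end == OPEN_END})
    return rows

# Sample history from the question, with dates rewritten as ISO strings.
history = [
    (1, "city", "New York", "2020-01-01", "2022-01-05"),
    (1, "city", "Paris", "2022-01-05", "9999-12-31"),
    (1, "number_of_employees", "50", "2021-01-01", "2022-01-01"),
    (1, "number_of_employees", "100", "2022-01-01", "9999-12-31"),
]
rows = eavt_to_scd2(history)
for row in rows:
    print(row)
```

Because the attribute list is collected from the data, the same function works for any number of attributes; entities that never appear in the history would still have to be added from the current-state table.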
To add more complexity to the task, imagine we have about 120 attributes, and also that a company that was never changed (just created and still in the same state as at creation) won't exist in the "History" table. So in our example, Company 2 will not exist in the history table at all and will have to be read from the first table into the SCD (union between current table and history result table). Fun right! :)
To give you a sense of the technical environment, the CRM is HubSpot, data is replicated from HubSpot to BigQuery and the reporting tool is Power BI.
I have tried to use pivoting in both Power BI and BigQuery, which is the standard solution when it comes to EAV model tables, but I'm stuck at the calculation of the valid_from/valid_to in the result SCD. (Example of using the pivoting here: https://dba.stackexchange.com/questions/20275/solutions-for-reporting-off-of-an-eav-structured-database )
I need one process that can be applied to multiple tables (because this example is only for company, but I have also other objects that I need to convert into SCD).
So what is the best way to convert this EAVT data into an SCD without falling into a labyrinth of hard code and performance issues? And how can the valid_from/valid_to be calculated dynamically?
Whether it's BigQuery or Power Query or just theoretical, any solutions, tips, ideas or just plain opinion is highly appreciated as this is the last step into the adoption of a whole data culture in the company I work for, and if I cannot make this, well... my credibility will be hit! so please help a fellow lost IT professional! :D
Too broad question - but anyway, below is just to give you an idea. Obviously it does not cover all cases - but hope you can work it further out
select company_id, city, number_of_employees, min(day) valid_from, max(day) valid_to
from (
  select * from (
    select company_id, attribute, value, day
    from history,
    unnest(generate_date_array(date(valid_from), if(valid_to = '9999-12-31', date('2222-12-31'), date(valid_to)))) day
  )
  pivot (any_value(value) for attribute in ('city', 'number_of_employees'))
)
group by company_id, city, number_of_employees
if applied to sample data as in your question
with history as (
select 1 company_id, 'city' attribute, 'New York' value, '2020-01-01' valid_from, '2022-01-05' valid_to union all
select 1, 'city', 'Paris', '2022-01-05', '2222-12-31' union all
select 1, 'number_of_employees', '50', '2021-01-01', '2022-01-01' union all
select 1, 'number_of_employees', '100', '2022-01-01', '2222-12-31'
)
I want to show the number of people in each contract status historically. I have a list of every contract's start date, suspension dates, expiration date, and termination date. As a brief example this is what my table looks like:
Client  Location  StartDate  ExpDate   SuspensionStart  SuspensionEnd  TerminatedDate
======  ========  =========  ========  ===============  =============  ==============
Jane    NJ        1/1/22     1/1/23    3/1/22           5/1/22         NULL
John    NY        11/15/22   11/15/23  NULL             NULL           3/8/22
Alice   NY        3/12/21    3/12/22   6/1/21           8/1/21         NULL
Jack    NJ        6/20/21    6/20/22   NULL             NULL           NULL
My goal is to get my table to look like this for the month of March:
Active  Suspended  Expired  Terminated
======  =========  =======  ==========
1       1          1        1
Then be able to drill down by location too.
I have two variables that I want to count by a single date (count if ExpDate = month/year and count if TerminatedDate = month/year) and two variables that are defined by date ranges (active and suspended).
One more piece of context: this data is pulled using a SQL query from a shared Snowflake database. There is no calendar table and I cannot create one except as a view, which I built with:
select
    dateadd(day, seq, dt::date) as dat,
    year(dat) as "YEAR",
    quarter(dat) as "QUARTER OF YEAR",
    month(dat) as "MONTH",
    day(dat) as "DAY",
    dayofmonth(dat) as "DAY OF MONTH",
    dayofweek(dat) as "DAY OF WEEK",
    dayname(dat) as dayName,
    dayofyear(dat) as "DAY OF YEAR"
from (
    select seq4() as seq, dateadd(month, 1, '2015-01-01'::date) as dt
    from table(generator(rowcount => 16000))
)
I haven't used scaffolding before, and I'm unsure which date to build the relationship on (or join on).
Scaffolding is best done by Tableau Prep. There are multiple steps involved and Prep can step through them while it is very challenging with Tableau Desktop. See https://www.tableau.com/about/blog/2018/12/scaffold-data-tableau-prep-fill-gaps-your-data-set-99389 for one example of how to scaffold the data.
You can apply the techniques in the blog article and create the four metrics that you want to show.
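The logic that scaffolding needs to reproduce (look at each contract for a given calendar month and decide which of the four statuses applies) can be sketched outside Tableau in plain Python. This is only an illustration under stated assumptions: the precedence Terminated > Expired > Suspended > Active is a guess that happens to reproduce the March example from the question, and the function name status_in_month is hypothetical.

```python
from collections import Counter
from datetime import date

# (client, location, start, exp, susp_start, susp_end, terminated), sample rows from the question
contracts = [
    ("Jane", "NJ", date(2022, 1, 1), date(2023, 1, 1), date(2022, 3, 1), date(2022, 5, 1), None),
    ("John", "NY", date(2022, 11, 15), date(2023, 11, 15), None, None, date(2022, 3, 8)),
    ("Alice", "NY", date(2021, 3, 12), date(2022, 3, 12), date(2021, 6, 1), date(2021, 8, 1), None),
    ("Jack", "NJ", date(2021, 6, 20), date(2022, 6, 20), None, None, None),
]

def status_in_month(contract, year, month):
    """Return the contract's status for one calendar month, or None if it has none."""
    _, _, start, exp, susp_start, susp_end, terminated = contract
    first = date(year, month, 1)
    next_first = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)

    def in_month(d):
        return d is not None and first <= d < next_first

    # Assumed precedence: point-in-time events win over range-based statuses.
    if in_month(terminated):
        return "Terminated"
    if in_month(exp):
        return "Expired"
    if susp_start is not None and susp_start < next_first and (susp_end is None or susp_end >= first):
        return "Suspended"
    if start < next_first and exp >= next_first and (terminated is None or terminated >= next_first):
        return "Active"
    return None

# Counts for March 2022, overall and per location (the drill-down).
overall = Counter(status_in_month(c, 2022, 3) for c in contracts)
by_location = Counter((c[1], status_in_month(c, 2022, 3)) for c in contracts)
print(dict(overall))
print(dict(by_location))
```

Running the same function over every month in the calendar view gives one status per contract per month, which is the shape the join on the calendar dates is meant to produce.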
I am using HANA and I have two tables. Both tables have
Customer_Number
Transaction_Number
Transaction_Date
Transaction_Week
Spend
Means_of_Payment
The only difference in the TYPE of information between the tables is the Means_of_Payment. Table 1 has transactions paid with the debit card or credit card and table 2 has transactions paid with a gift card or a voucher. The customers in table 1 are the same as customers in table 2
Focus: Customers use gift card/vouchers mostly in Apr
My aim: What is a customer’s spend 2 weeks before their transaction in April and 2 weeks after their transaction in April. Idea is to see if and how their spend changed after using a gift card/voucher
Problem: Each customer’s Apr transaction date will be different so I need to create a dynamic query that will look up a customer’s transaction date in the month of April (table 2) and give me their spend 2 weeks before and after that date (table 1) and I’m really unsure about how to do this.
Expected Result: Customer 1 Apr transaction date = 1/04/2018 so their relevant date range is 2 weeks before 1/04/2018 and 2 weeks after.
Customer 2 Apr transaction date = 5/04/2018 so their relevant date range is 2 weeks before 5/04/2018 and 2 weeks after.
I want to return the transaction_number, spend and Means_of_payment in each customer's relevant date range
Any ideas?
Thanks in advance, I appreciate your help 😊
I only have code that is joining the two tables
SELECT *
FROM "Table2" AS A
LEFT JOIN "Table1" AS B
ON A.CUSTOMER = B.CUSTOMER_NUMBER
Try this one:
SELECT cards.*
FROM "Table2" AS coupons
JOIN "Table1" AS cards
  ON coupons.CUSTOMER = cards.CUSTOMER_NUMBER
 AND cards.Transaction_Date BETWEEN ADD_DAYS(coupons.Transaction_Date, -14)
                                AND ADD_DAYS(coupons.Transaction_Date, 14)
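To see the windowed join in action without a HANA instance, here is a small sketch using Python's sqlite3. SQLite's date(x, '-14 days') stands in for HANA's ADD_DAYS, the table names card_tx and voucher_tx are hypothetical replacements for Table1 and Table2, the sample rows are made up, and the April filter plus the before/after label are additions that the answer above does not include.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE card_tx    (Customer_Number INTEGER, Transaction_Number INTEGER,
                         Transaction_Date TEXT, Spend REAL, Means_of_Payment TEXT);
CREATE TABLE voucher_tx (Customer_Number INTEGER, Transaction_Number INTEGER,
                         Transaction_Date TEXT, Spend REAL, Means_of_Payment TEXT);

-- Customer 1 uses a voucher on 2018-04-01, so the window is 2018-03-18 .. 2018-04-15.
INSERT INTO voucher_tx VALUES (1, 900, '2018-04-01', 20.0, 'voucher');
INSERT INTO card_tx VALUES
    (1, 101, '2018-03-10', 50.0, 'debit'),   -- more than 14 days before: excluded
    (1, 102, '2018-03-25', 30.0, 'credit'),  -- inside the window, before
    (1, 103, '2018-04-10', 45.0, 'debit'),   -- inside the window, after
    (1, 104, '2018-04-20', 60.0, 'debit');   -- more than 14 days after: excluded
""")

rows = conn.execute("""
SELECT c.Transaction_Number, c.Spend, c.Means_of_Payment,
       CASE WHEN c.Transaction_Date < v.Transaction_Date
            THEN 'before' ELSE 'after' END AS period
FROM voucher_tx AS v
JOIN card_tx AS c
  ON c.Customer_Number = v.Customer_Number
 AND c.Transaction_Date BETWEEN date(v.Transaction_Date, '-14 days')
                            AND date(v.Transaction_Date, '+14 days')
WHERE strftime('%m', v.Transaction_Date) = '04'  -- only April voucher transactions
ORDER BY c.Transaction_Number
""").fetchall()
print(rows)
```

Because the window is computed per voucher row, each customer gets their own date range without any dynamic SQL.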
We have a table that stores the number and type of trips made by a driver on a given day, as follows:
Date Delivery Pick-up
==== ======== =======
01/01/2013 5 0
We also have an attendance table that stores the driver attendance as follows.
AttDate InTime OutTIme THours
======= ====== ======= ======
01/01/2013 10:00 13:00 3
How do I calculate the average time per trip across the 5 trips, using the employee's THours, in MSSQL 2008/2012? This is for performance monitoring purposes.
I'm assuming that you have some sort of driver_id, which you'll have to add to the join, or, as @Randy points out, you won't be able to determine which rows belong to which drivers.
Here's the general form of the query:
SELECT Trip.business_day,
       DATEDIFF(minute, arrivedAt, leftBy) / CASE WHEN deliveries = 0
                                                  THEN 1
                                                  ELSE deliveries END
           AS average_deliveries_in_minutes
FROM Trip
JOIN Driver_Attendence
  ON Driver_Attendence.business_day = Trip.business_day
(working SQL Fiddle example)
You didn't actually say what you wanted to do when deliveries = 0; the CASE is there so you don't get 'divide-by-zero' errors. Excluding drivers without any deliveries would allow you to remove the CASE, and just reference the column.
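As a worked example with the numbers from the question (3 hours of attendance, 5 deliveries and 0 pick-ups), the sketch below runs the same idea through Python's sqlite3. The table and column names (trips, attendance, driver_id and so on) are hypothetical, NULLIF is used in place of the CASE so that a day with no trips yields NULL instead of dividing by 1, and pick-ups are counted as trips, which is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE trips      (driver_id INTEGER, trip_date TEXT, delivery INTEGER, pickup INTEGER);
CREATE TABLE attendance (driver_id INTEGER, att_date TEXT, thours REAL);
INSERT INTO trips      VALUES (1, '2013-01-01', 5, 0);
INSERT INTO attendance VALUES (1, '2013-01-01', 3);
""")

# Minutes worked divided by number of trips; NULLIF turns a zero divisor into NULL.
result = conn.execute("""
SELECT t.driver_id, t.trip_date,
       a.thours * 60.0 / NULLIF(t.delivery + t.pickup, 0) AS minutes_per_trip
FROM trips AS t
JOIN attendance AS a
  ON a.driver_id = t.driver_id
 AND a.att_date  = t.trip_date
""").fetchall()
print(result)
```

Three hours is 180 minutes, so five trips average 36 minutes each.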
I have data(result / output) in a table like this:
Project code  project name  associates                 time efforts in days
1             Analytics     amol,manisha,sayali,pooja  (21+17+20+17)=75
I need to calculate the time efforts in days. I have done it for February by adding up the days each person worked in that month, that is, all working days minus the days each associate was absent.
So, I need to do this by SQL queries.
I have one table which contains all the associates present with dates.
Like this:
UID username date
So can anyone give me a suggestion for how I could do this?
It would be a better design to have a separate table that stores the project id, the team member id and his/her effort in days, so that you can write a simple join query to achieve what you want.
Here is what I would do. Change your tables so you have:
projects
project_code project_name
1 Analytics
users
UID username date
1 amol
2 manisha
3 sayali
projects_users
project_code uid effort
1 1 21
1 2 17
1 3 20
Now you can query the result you asked for like this:
SELECT
  p.project_code,
  p.project_name,
  GROUP_CONCAT(DISTINCT u.username SEPARATOR ', ') AS associates,
  SUM(pu.effort) AS effort
FROM projects AS p
JOIN projects_users AS pu ON pu.project_code = p.project_code
JOIN users AS u ON u.uid = pu.uid
GROUP BY p.project_code, p.project_name
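The query can be tried against the sample rows above with Python's sqlite3. This is a sketch under assumptions: SQLite stands in for MySQL, so GROUP_CONCAT takes the separator as a second argument and DISTINCT is dropped (SQLite does not allow both together), and the date column of users is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE projects       (project_code INTEGER PRIMARY KEY, project_name TEXT);
CREATE TABLE users          (uid INTEGER PRIMARY KEY, username TEXT);
CREATE TABLE projects_users (project_code INTEGER, uid INTEGER, effort INTEGER);
INSERT INTO projects       VALUES (1, 'Analytics');
INSERT INTO users          VALUES (1, 'amol'), (2, 'manisha'), (3, 'sayali');
INSERT INTO projects_users VALUES (1, 1, 21), (1, 2, 17), (1, 3, 20);
""")

# One row per project: concatenated associate names and summed effort.
row = conn.execute("""
SELECT p.project_code,
       p.project_name,
       GROUP_CONCAT(u.username, ', ') AS associates,
       SUM(pu.effort) AS effort
FROM projects AS p
JOIN projects_users AS pu ON pu.project_code = p.project_code
JOIN users AS u ON u.uid = pu.uid
GROUP BY p.project_code, p.project_name
""").fetchone()
print(row)
```

With the three associates listed in the sample tables, the summed effort is 21 + 17 + 20 = 58 days.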