Can someone explain what's happening in this code? When I select two columns from different tables, the output always looks like the "actual output" below.

select num1.n, 2 from num1, num2
expected output
table num1 table num2
2 2
3 3
4 4
5 5
6 6
7
8
9
10
actual output
num1, num2
2 2
3 2
4 2
5 2
6 2
7 2
8 2
9 2
10 2
3 2
4 2
5 2
6 2
7 2
8 2
9 2
10 2
3 2
4 2
5 2
6 2
7 2
8 2
9 2
10 2

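The 2 in your SELECT list is a constant, not a column, so every row shows 2. The comma between num1 and num2 is a cross join, so each row of num1 is also repeated once for every row of num2.
You need to work out how the two tables relate to each other.
After that you can build your SELECT with a JOIN that connects the data from the two tables correctly.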
For example:
SELECT num1.n, num2.[field_from_table_num2]
FROM num1
JOIN num2 ON num2.[relation_field_or_key]= num1.[relation_field_or_key];

You have defined a static column, which is why you get 2 in the second column. Try the following query:
select num1.n As 'First_Table', num2.n As 'Second_Table'
from num1, num2
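
To make the difference concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table contents and the choice of joining on n are assumptions made for illustration; the question does not say how num1 and num2 are related.

```python
import sqlite3

# In-memory database with two small tables (sample values are assumed)
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE num1 (n INTEGER);
    CREATE TABLE num2 (n INTEGER);
    INSERT INTO num1 VALUES (2), (3), (4);
    INSERT INTO num2 VALUES (2), (3);
""")

# Literal 2 plus a comma (cross) join: one row per (num1, num2) pair,
# and the second column is the constant 2 in every row
constant_rows = conn.execute(
    "SELECT num1.n, 2 FROM num1, num2 ORDER BY num1.n"
).fetchall()

# LEFT JOIN on n (assumed relation): one row per num1 row,
# with NULL (None) where num2 has no matching value
joined_rows = conn.execute(
    "SELECT num1.n, num2.n FROM num1 "
    "LEFT JOIN num2 ON num2.n = num1.n ORDER BY num1.n"
).fetchall()

print(constant_rows)
print(joined_rows)
```

With 3 rows in num1 and 2 rows in num2, the first query returns 6 rows, while the LEFT JOIN returns 3 rows that line up the way the expected output in the question does.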

SQL append data based on multiple dates

I have two tables; one contains encounter dates and the other order dates. They look like this:
id enc_id enc_dt
1 5 06/11/20
1 6 07/21/21
1 7 09/15/21
2 2 04/21/20
2 5 05/05/20
id enc_id ord_dt
1 1 03/7/20
1 2 04/14/20
1 3 05/15/20
1 4 05/30/20
1 5 06/12/20
1 6 07/21/21
1 7 09/16/21
1 8 10/20/21
1 9 10/31/21
2 1 04/15/20
2 2 04/21/20
2 3 04/30/20
2 4 05/02/20
2 5 05/05/20
2 6 05/10/20
The order and encounter dates can be the same, or differ slightly for the same encounter ID. I'm trying to get a table that contains all order dates before each encounter date. So the data would look like this:
id enc_id enc_dt enc_key
1 1 03/7/20 5
1 2 04/14/20 5
1 3 05/15/20 5
1 4 05/30/20 5
1 5 06/11/20 5
1 1 03/7/20 6
1 2 04/14/20 6
1 3 05/15/20 6
1 4 05/30/20 6
1 5 06/12/20 6
1 6 07/21/21 6
1 1 03/7/20 7
1 2 04/14/20 7
1 3 05/15/20 7
1 4 05/30/20 7
1 5 06/12/20 7
1 6 07/21/21 7
1 7 09/15/21 7
2 1 04/15/20 2
2 2 04/21/20 2
2 1 04/15/20 5
2 2 04/21/20 5
2 3 04/30/20 5
2 4 05/02/20 5
2 5 05/05/20 5
Is there a way to do this? I am having trouble figuring out how to append the orders and encounter table for each encounter based on orders that occur before a certain date.
You can join the two tables as follows:
SELECT O.id, O.enc_id, O.ord_dt, E.enc_id AS enc_key
FROM order_tbl O
JOIN encounter_tbl E
  ON O.id = E.id
 AND O.ord_dt <= E.enc_dt
See a demo from db<>fiddle.
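
The join above can be tried out with Python's built-in sqlite3 module. This is only a sketch: it loads the rows for id 2 from the question, and it stores the dates as ISO strings (YYYY-MM-DD), which is an assumption made so that plain string comparison orders them correctly. The enc_key alias is also just an illustrative name taken from the desired output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE encounter_tbl (id INTEGER, enc_id INTEGER, enc_dt TEXT);
    CREATE TABLE order_tbl (id INTEGER, enc_id INTEGER, ord_dt TEXT);
    INSERT INTO encounter_tbl VALUES
        (2, 2, '2020-04-21'), (2, 5, '2020-05-05');
    INSERT INTO order_tbl VALUES
        (2, 1, '2020-04-15'), (2, 2, '2020-04-21'), (2, 3, '2020-04-30'),
        (2, 4, '2020-05-02'), (2, 5, '2020-05-05'), (2, 6, '2020-05-10');
""")

# Non-equi join: each encounter picks up every order of the same id
# whose date is on or before the encounter date
order_rows = conn.execute("""
    SELECT O.id, O.enc_id, O.ord_dt, E.enc_id AS enc_key
    FROM order_tbl O
    JOIN encounter_tbl E
      ON O.id = E.id
     AND O.ord_dt <= E.enc_dt
    ORDER BY E.enc_id, O.enc_id
""").fetchall()

for row in order_rows:
    print(row)
```

Encounter 2 picks up orders 1 and 2, and encounter 5 picks up orders 1 through 5, which matches the rows for id 2 in the desired output.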

How can I create a column of numbers that ascends after a certain amount of rows?

I have a column of scores going in descending order. I want to create a difficulty column on a scale of 1-10 that goes up every 37 rows for difficulties 1-7 and then every 36 rows for difficulties 8-10. I have created a small example below where the difficulty goes up in 3-row intervals and the final difficulties 4 and 5 get 2 rows each.
In:
score
0 11
1 10
2 9
3 8
4 8
5 6
6 5
7 4
8 4
9 3
10 2
11 1
12 1
Out:
score difficulty
0 11 1
1 10 1
2 9 1
3 8 2
4 8 2
5 6 2
6 5 3
7 4 3
8 4 3
9 3 4
10 2 4
11 1 5
12 1 5
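If I understand your problem correctly, you could do something like:
import pandas as pd
from random import randint

# 7 difficulty levels of 37 rows each, followed by 3 levels of 36 rows each
count = 37 * 7 + 36 * 3
# Integer division maps row positions 0-36 to level 1, 37-73 to level 2, and so on
difficulty = [i // 37 + 1 for i in range(37 * 7)] + [i // 36 + 8 for i in range(36 * 3)]
# Random scores stand in for the real score column
df = pd.DataFrame({'score': [randint(0, 10) for _ in range(count)]})
df['difficulty'] = difficulty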
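
The same idea can be written for arbitrary block sizes. The helper below is a hypothetical name introduced for illustration, not part of the answer. It takes the number of rows per level and returns the difficulty list, which can then be assigned to the column as above.

```python
def difficulty_levels(block_sizes):
    """Return one difficulty value per row: level 1 repeated block_sizes[0]
    times, level 2 repeated block_sizes[1] times, and so on."""
    levels = []
    for level, size in enumerate(block_sizes, start=1):
        levels.extend([level] * size)
    return levels


# The small example from the question: three rows each for 1-3, two rows each for 4-5
small = difficulty_levels([3, 3, 3, 2, 2])
print(small)

# The full case: 37 rows each for levels 1-7, 36 rows each for levels 8-10
full = difficulty_levels([37] * 7 + [36] * 3)
print(len(full))
```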

If a column value does not have a certain number of occurrences in a dataframe, how to duplicate rows at random until that count is met?

Say that this is what my dataframe looks like
A B
0 1 5
1 4 2
2 3 5
3 3 3
4 3 2
5 2 0
6 4 5
7 2 3
8 4 1
9 5 1
I want every unique value in column B to occur at least 3 times. So none of the rows with a B value of 5 are duplicated. The row with a B value of 0 is duplicated twice. And the rest have one of their two rows duplicated at random.
Here is an example desired output
A B
0 1 5
1 4 2
2 3 5
3 3 3
4 3 2
5 2 0
6 4 5
7 2 3
8 4 1
9 5 1
10 4 2
11 2 3
12 2 0
13 2 0
14 4 1
Edit:
The row chosen to be duplicated should be selected at random
To pick rows at random, I would use groupby and apply with sample on each group. The x in the lambda is each group of B, so I use repeats - x.shape[0] to find the number of rows that need to be created. A group of B may already have more than 3 rows, so I use np.clip to force negative values to 0. Sampling 0 rows is the same as ignoring the group. Finally, reset_index and append the result back to df.
import numpy as np
import pandas as pd

repeats = 3
# Per group of B, sample the missing rows with replacement; clip keeps the count >= 0
df1 = (df.groupby('B')
         .apply(lambda x: x.sample(n=int(np.clip(repeats - x.shape[0], 0, None)), replace=True))
         .reset_index(drop=True))
# DataFrame.append was removed in pandas 2.0, so use pd.concat instead
df_final = pd.concat([df, df1]).reset_index(drop=True)
Out[43]:
A B
0 1 5
1 4 2
2 3 5
3 3 3
4 3 2
5 2 0
6 4 5
7 2 3
8 4 1
9 5 1
10 2 0
11 2 0
12 5 1
13 4 2
14 2 3
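
For readers who want to see the logic without pandas, here is a rough standard-library sketch of the same technique: group the rows, work out each group's shortfall, and sample that many rows with replacement. The function name pad_groups and its parameters are made up for this illustration.

```python
import random
from collections import defaultdict


def pad_groups(rows, key, min_count, rng=random):
    """Return rows plus randomly chosen duplicates so that every key value
    occurs at least min_count times. Original rows keep their order."""
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row)

    extra = []
    for members in groups.values():
        # Groups that already have enough rows get a shortfall of 0
        shortfall = max(min_count - len(members), 0)
        # choices() samples with replacement; k=0 yields an empty list
        extra.extend(rng.choices(members, k=shortfall))
    return list(rows) + extra


# (A, B) pairs from the question
data = [(1, 5), (4, 2), (3, 5), (3, 3), (3, 2),
        (2, 0), (4, 5), (2, 3), (4, 1), (5, 1)]
padded = pad_groups(data, key=lambda r: r[1], min_count=3, rng=random.Random(0))
print(len(padded))
```

Passing a seeded random.Random instance makes the choice of duplicated rows reproducible; leaving rng at its default uses the module-level generator.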

calculate the total value for each group using Calculated Column in Spotfire

I have a problem with a sum calculation over rows using a calculated column in Spotfire.
For example, the raw data is shown below. The raw table is ordered by id, and for each type the state sequence is 2, 3, 0.
id type value state
1 1 12 2
2 1 7 3
3 1 10 0
4 2 11 2
5 2 6 3
6 3 9 0
7 3 7 2
8 3 5 3
9 2 9 0
10 1 7 2
11 1 3 3
12 1 2 0
For each cycle of states (2, 3, 0) within a type, I want to sum the value column, so the result would be:
id type value state cycle time
1 1 12 2
2 1 7 3
3 1 10 0 29
4 2 11 2
5 2 6 3
6 3 7 2
7 3 5 3
8 3 9 0 21
9 2 9 0 26
10 2 7 2
11 2 3 3
12 2 2 0 12
Note: only the rows whose state is 0 will have the sum value. I think the rule is easier to see when the table is ordered by type:
id type value state cycle time
1 1 12 2
2 1 7 3
3 1 10 0 29
4 2 11 2
5 2 6 3
9 2 9 0 26
10 2 7 2
11 2 3 3
12 2 2 0 12
6 3 7 2
7 3 5 3
8 3 9 0 21
thanks for your time and help!
Here is a solution for you.
1. Insert a Calculated Column RowId() and name it RowId
2. Insert a Calculated Column If(Mod([RowId],3)=0,[RowId] / 3,Ceiling([RowId] / 3)) and name it Groups
3. Insert a Calculated Column Sum([value]) OVER ([Groups]) and name it RunningSum
4. Insert a Calculated Column If([state] = 0,[RunningSum]) and name it OnlyState=0
The only thing to really explain here is #2 (the Groups column). With the data sorted as you listed in your example, the RowId of the last row in each group is divisible by 3. We have to do it this way since your type field can have multiple groups for any given type. RowId 3, 6, 9, 12, etc. all have a modulus of 0 since they are divisible by 3, which marks the last row in each set. For the last row we just use RowId / 3, which gives us groups 1, 2, 3, 4, etc. For the rows that aren't divisible by 3, Ceiling rounds RowId / 3 up to the next whole number, which is the group number of the last row in that set.
The last calculated column is the only way I know how to get ONLY the values you care about. If you use the If [state] = 0 logic anywhere else, you negate all other rows.
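
The four calculated columns can be mimicked in plain Python to check the arithmetic. This is only a sketch of the logic, not Spotfire code. It uses the type-sorted rows from the question and assumes, as the answer does, that every cycle is exactly three consecutive rows.

```python
import math

# (id, type, value, state) in the type-sorted order from the question
rows = [
    (1, 1, 12, 2), (2, 1, 7, 3), (3, 1, 10, 0),
    (4, 2, 11, 2), (5, 2, 6, 3), (9, 2, 9, 0),
    (10, 2, 7, 2), (11, 2, 3, 3), (12, 2, 2, 0),
    (6, 3, 7, 2), (7, 3, 5, 3), (8, 3, 9, 0),
]

groups = []      # plays the role of the Groups column
group_sums = {}  # plays the role of Sum([value]) OVER ([Groups])
for row_id, (_, _, value, _) in enumerate(rows, start=1):  # RowId starts at 1
    group = math.ceil(row_id / 3)  # rows 1-3 -> 1, rows 4-6 -> 2, ...
    groups.append(group)
    group_sums[group] = group_sums.get(group, 0) + value

# Show the group total only on rows where state is 0
cycle_time = [
    group_sums[group] if state == 0 else None
    for group, (_, _, _, state) in zip(groups, rows)
]
print(cycle_time)
```

The totals land on the state 0 rows as 29, 26, 12 and 21, matching the cycle time column in the type-sorted table.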

Return unique combinations from many to many join

I have a hierarchy table with the following data :
SOURCE TARGET Level ID
0 1 1 1
0 2 1 2
2 3 2 3
2 4 2 4
2 5 2 5
1 3 2 6
1 4 2 7
1 5 2 8
5 3 3 9
5 3 3 10
4 3 3 11
4 3 3 12
3 6 3 13
3 6 3 14
3 6 4 15
3 6 4 16
3 6 4 17
3 6 4 18
The SOURCE and TARGET rows are the original data and are used to connect between parents and children. for example, the third row (SOURCE 2, TARGET 3 on LEVEL 2) connects to the second row (SOURCE 0, TARGET 2 on LEVEL 1) since the Source of the first equals the target of the second.
The ID column is added at the end using a ROW_NUMBER function and is used to give each row a unique ID.
It may be easier to understand if SOURCE is replaced with PARENT and TARGET with CHILD.
I join the table to itself in order to find the "parent".
I want each "instance" of a "source" on each level to connect to one of its parents. It's not important which ones connect but all need to be connected and to different parents.
The final results should look something like this:
SOURCE TARGET Level ID P_ID
0 1 1 1 NULL
0 2 1 2 NULL
2 3 2 3 2
2 4 2 4 2
2 5 2 5 2
1 3 2 6 1
1 4 2 7 1
1 5 2 8 1
5 3 3 9 5
5 3 3 10 8
4 3 3 11 4
4 3 3 12 7
3 6 3 13 3
3 6 3 14 6
3 6 4 15 9
3 6 4 16 10
3 6 4 17 11
3 6 4 18 12
Any suggestions on how to write a good ms-sql query for this?
Link to sample data and SQL Fiddle
The query to use is below.
;with cte as (
    select *,
           rn = row_number() over (partition by level, target order by id),
           lc = count(1) over (partition by level, target)
    from tbl
)
select a.*, b.id as parent_id
from cte a
left join cte b
       on b.level = a.level - 1
      and b.target = a.source
      and b.rn = (a.rn - 1) % b.lc + 1
order by id
Items are sequenced within each level/target combination (rn), and the number of items in that combination is counted (lc).
Children are linked to parents by sequence number. If there are more children than parents, the MOD (%) operator wraps around to the first parent and continues the distribution.
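
To check the round-robin rule against the sample data, the query's logic can be re-implemented in plain Python. This is a sketch for verification only. The function name assign_parents is made up, and the code mirrors rn, lc and the modulo step rather than running the SQL itself.

```python
from collections import defaultdict

# (source, target, level, id) rows from the question
hierarchy = [
    (0, 1, 1, 1), (0, 2, 1, 2),
    (2, 3, 2, 3), (2, 4, 2, 4), (2, 5, 2, 5),
    (1, 3, 2, 6), (1, 4, 2, 7), (1, 5, 2, 8),
    (5, 3, 3, 9), (5, 3, 3, 10), (4, 3, 3, 11), (4, 3, 3, 12),
    (3, 6, 3, 13), (3, 6, 3, 14),
    (3, 6, 4, 15), (3, 6, 4, 16), (3, 6, 4, 17), (3, 6, 4, 18),
]


def assign_parents(rows):
    """Map each row id to a parent id, or None when no parent exists."""
    # Partition ids by (level, target), ordered by id, like the window functions
    partitions = defaultdict(list)
    for source, target, level, row_id in sorted(rows, key=lambda r: r[3]):
        partitions[(level, target)].append(row_id)
    # rn: 1-based position of each row inside its partition
    rn = {row_id: pos
          for ids in partitions.values()
          for pos, row_id in enumerate(ids, start=1)}

    parents = {}
    for source, target, level, row_id in rows:
        # Candidate parents sit one level up and target this row's source
        candidates = partitions.get((level - 1, source))
        if not candidates:
            parents[row_id] = None
        else:
            # Same as b.rn = (a.rn - 1) % b.lc + 1, with a 0-based list index
            parents[row_id] = candidates[(rn[row_id] - 1) % len(candidates)]
    return parents


parents = assign_parents(hierarchy)
print(parents)
```

For the sample data this reproduces the P_ID column from the question, for example id 10 maps to 8 and id 12 maps to 7.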