Generate random records from the table tblFruit based on the field Type - sql

I need help generating random records from the table tblFruit based on the field Type, without duplicates.
As per the table above, there are 4 types of fruit, numbered 1, 2, 3 and 4.
I want to select x records dynamically from tblFruit (e.g. 7 records).
Say I need 7 random fruit records. The result should contain fruit of the different types, but exactly 7 records in total, i.e.
2 records of type 1,
2 records of type 2,
2 records of type 3,
1 record of type 4.
Note: if I want 10 records (again without duplicates), I should get 2 records of each type plus two remaining records chosen randomly from any type.
Many thanks for your help.

I might suggest:
    select top (7) f.*
    from tblfruit f
    order by row_number() over (partition by type order by newid());
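This numbers the rows of each type in random order and takes the lowest numbers first, so the result has approximately the same number of rows of each type (off by at most 1, as long as every type has enough rows), which meets your needs.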
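To see why ordering by a per-type row number gives a balanced pick, here is a minimal Python sketch of the same idea. It is not the SQL Server implementation: the sample rows, the function name pick_balanced and the random tiebreak between types are assumptions made for illustration (in the T-SQL query, ties between equal row numbers are broken arbitrarily).

```python
import random
from collections import Counter, defaultdict

def pick_balanced(rows, n, rng=random):
    """Mimic ORDER BY ROW_NUMBER() OVER (PARTITION BY type ORDER BY NEWID())."""
    by_type = defaultdict(list)
    for row in rows:
        by_type[row["type"]].append(row)

    ranked = []
    for group in by_type.values():
        shuffled = group[:]
        rng.shuffle(shuffled)  # random order inside each type, like NEWID()
        for rn, row in enumerate(shuffled, start=1):
            # The second element is a random tiebreak (an addition in this
            # sketch) so leftover slots fall on random types.
            ranked.append((rn, rng.random(), row))

    # Lowest row numbers first: every type contributes its 1st row,
    # then its 2nd row, and so on, until n rows are taken.
    ranked.sort(key=lambda t: (t[0], t[1]))
    return [row for _, _, row in ranked[:n]]

# Hypothetical data: 20 fruit rows, 5 of each type 1..4.
fruits = [{"id": i, "type": (i % 4) + 1} for i in range(20)]

picked = pick_balanced(fruits, 7)
counts_7 = sorted(Counter(r["type"] for r in picked).values())
counts_10 = sorted(Counter(r["type"] for r in pick_balanced(fruits, 10)).values())
```

With 4 types, asking for 7 rows gives per-type counts of 2, 2, 2 and 1, and asking for 10 gives 3, 3, 2 and 2, with no row picked twice.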

Related

SQL new variable using multiple conditions (count of occurrences in 6 month look-back period using timestamp for each unique ID)

I am trying to achieve the following:
Attached is what my data looks like.
I want to create 2 new variables that count the number of times 'Target' (variable 1) and 'Competitor' (variable 2) appear within the 6 months before a given date_of_prescription. This should be done for every unique D_PRESCRIBER_ID.
For example:
Take ID 1003000902 prescribing the COMPETITOR drug on 2020-03-18. Looking at the rows before that, within the 6 months prior to 2020-03-18 there are 2 Target drugs prescribed and 0 Competitor drugs prescribed. So my variable values would be 2 (variable 1) and 0 (variable 2).
My data is much larger than the screenshot shows. It has more variables and thousands of unique D_PRESCRIBER_IDs. Each row is not a unique ID; the same ID appears multiple times with different date_of_prescription timestamps. These variables need to be created in my select statement so that the rest of the data stays the same.
Any help here would be awesome. Thanks!
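For reference, the counting rule described above can be sketched in Python as follows. This only illustrates the logic, not a SQL answer: the column name drug, the helper names and the sample dates are assumptions, and the window is taken to cover rows dated strictly before the current prescription and no earlier than 6 calendar months back.

```python
import calendar
from collections import defaultdict
from datetime import date

def months_before(d, months):
    """Date `months` calendar months before d, clamping the day if needed."""
    total = d.year * 12 + (d.month - 1) - months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)

def add_lookback_counts(rows, months=6):
    """Return copies of rows with target_6m and competitor_6m added."""
    by_id = defaultdict(list)
    for r in rows:
        by_id[r["D_PRESCRIBER_ID"]].append(r)

    out = []
    for r in rows:
        current = r["date_of_prescription"]
        start = months_before(current, months)
        # Earlier rows of the same prescriber inside the look-back window.
        prior = [p for p in by_id[r["D_PRESCRIBER_ID"]]
                 if start <= p["date_of_prescription"] < current]
        out.append({
            **r,
            "target_6m": sum(p["drug"].upper() == "TARGET" for p in prior),
            "competitor_6m": sum(p["drug"].upper() == "COMPETITOR" for p in prior),
        })
    return out

# Made-up rows for one prescriber; "drug" is a hypothetical column name.
rows = [
    {"D_PRESCRIBER_ID": 1003000902, "date_of_prescription": date(2019, 3, 1), "drug": "TARGET"},
    {"D_PRESCRIBER_ID": 1003000902, "date_of_prescription": date(2019, 10, 1), "drug": "TARGET"},
    {"D_PRESCRIBER_ID": 1003000902, "date_of_prescription": date(2020, 1, 15), "drug": "TARGET"},
    {"D_PRESCRIBER_ID": 1003000902, "date_of_prescription": date(2020, 3, 18), "drug": "COMPETITOR"},
]
result = add_lookback_counts(rows)
```

The last row gets target_6m = 2 and competitor_6m = 0, because the 2019-03-01 prescription falls outside the window that starts on 2019-09-18.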

Extract the highest key:value pair from a string in Standard SQL

I have data in the form of key/value pairs such as 116=0.2875, which BigQuery has stored as a string. I need to extract the key, i.e. 116, from each row.
To complicate things, if a row has more than one key/value pair, the key to extract is the one with the highest number on the right, e.g. for {1=0.1,2=0.8} the extracted key would be 2.
I am struggling to do this in SQL, particularly as some rows have one pair and some have several.
This is as close as I have managed to get: it extracts the highest right-hand value (which I don't need), but I can't work out how to get either the whole key/value pair, which would be fine for me, or just the key, which would be ideal.
select
    column,
    (SELECT MAX(CAST(Values AS NUMERIC)) FROM UNNEST(JSON_EXTRACT_ARRAY(REPLACE(REPLACE(REPLACE(column,"{","["),"}","]"),"=",","))) AS Values WHERE Values LIKE "%.%") AS Highest
from `table`
Here is some sample data:
1 {99=0.25}
2 {99=0.25}
3 {99=0.25}
4 {116=0.2875, 119=0.6, 87=0.5142857142857143}
5 {105=0.308724832214765}
6 {105=0.308724832214765}
7 {139=0.5712754555198284}
8 {127=0.5767967894928858}
9 {134=0.2530120481927711, 129=0.29696599825632086, 73=0.2662459427947186}
10 {80=0.21242613001118038}
Any help on this conundrum would be greatly appreciated!
Consider the approach below:
select column,
    (
        select cast(split(kv, '=')[offset(0)] as int64)
        from unnest(regexp_extract_all(column, r'(\d+=\d+\.\d+)')) kv
        order by cast(split(kv, '=')[offset(1)] as float64) desc
        limit 1
    ) key
from your_table
Applied to the sample data in your question, this returns the key paired with the highest value in each row, for example 119 for row 4 and 129 for row 9.
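The same extraction can be sketched in Python to check the logic against the sample rows. This mirrors the query rather than replacing it; the function name highest_key is made up, and unlike the SQL pattern this regex also accepts values without a decimal part.

```python
import re

# key=value pairs such as 116=0.2875; the decimal part is optional here.
PAIR = re.compile(r"(\d+)=(\d+(?:\.\d+)?)")

def highest_key(cell):
    """Return the key whose value is largest, or None if no pair is found."""
    pairs = [(int(k), float(v)) for k, v in PAIR.findall(cell)]
    if not pairs:
        return None
    return max(pairs, key=lambda kv: kv[1])[0]

sample = [
    "{99=0.25}",
    "{116=0.2875, 119=0.6, 87=0.5142857142857143}",
    "{134=0.2530120481927711, 129=0.29696599825632086, 73=0.2662459427947186}",
]
keys = [highest_key(cell) for cell in sample]
```

For these three strings, keys is [99, 119, 129].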

How to combine a row of cells in VBA if certain column values are the same

I have a database where all of the input from the user (through a userform) gets stored. Each column is a different category of data (e.g. date, shift, quantity), and the userform input is written to the corresponding column. For some entries, all the data is the same except for the quantity. How can I combine such rows into one and add their quantities together across the whole database (e.g. combining the first and third entries below)? I have tried a couple of different loops but can't figure it out.
Period  Date    Line  Shift  Type  Quantity
4 x 2   4/3/18  A     3      14    18
4 x 2   4/3/18  A     3      13    12
4 x 2   4/3/18  A     3      14    15
Thank you!
If you're looking to modify the underlying database, you can query the data into the format you want by listing all the other columns in a GROUP BY clause and summing Quantity, save the result to another table, then replace the original table with the properly formatted one.
If you have the data in Excel and you just want to view it with the duplicate rows summed, a Pivot Table would be a good choice. You can select all the other columns as rows for the Pivot Table and sum of Quantity as the values.
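Both suggestions come down to the same grouping step: treat every column except Quantity as the key and add up the quantities per key. A minimal Python sketch of that step, using the three rows from the question (the function name combine_rows is made up, and Period is assumed to be the single value "4 x 2"):

```python
def combine_rows(rows):
    """Sum the last field (quantity) over rows whose other fields are equal."""
    totals = {}
    for *key_fields, quantity in rows:
        key = tuple(key_fields)
        totals[key] = totals.get(key, 0) + quantity
    # dicts keep insertion order, so groups appear in first-seen order
    return [(*key, quantity) for key, quantity in totals.items()]

# (Period, Date, Line, Shift, Type, Quantity)
rows = [
    ("4 x 2", "4/3/18", "A", 3, 14, 18),
    ("4 x 2", "4/3/18", "A", 3, 13, 12),
    ("4 x 2", "4/3/18", "A", 3, 14, 15),
]
combined = combine_rows(rows)
```

The first and third rows collapse into one row with Quantity 33, and the Type 13 row is left unchanged.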

Decrement all values in a column after insert at top SQL

Before Inserting
Id Priority
1 . 1
2 . 2
3 . 3
After Inserting Id: 4, Priority 2
Id Priority
1 . 1
4 . 2
2 . 3
3 . 4
I'm fairly new to Postgres, and I have a table with a column named priority. This column should have unique values. If you give a new row a priority that already exists, it should be inserted with that priority, and every existing row at that priority or lower should be pushed down by one place (its priority number increased by 1, as in the example above) to accommodate it.
Is there a term for this sort of behavior? I know it will involve a column with unique values, but are there any constraints I can introduce to enable it? Or do I need to code an algorithm manually and account for all the edge cases?
I wouldn't store the displayed priority as its own field. Create the table as ID, priority, Date_entered. Then use:
Select ID, rank() over (order by priority, date_entered) as priority
...
Since the rank can change so frequently, I suspect calculating it on the fly like this is preferable to storing the rank and keeping it updated.
edit:
There is a logical flaw in this that I can spot already: if record 4 was inserted as priority 2 (so the database contains two priority 2 records), there wouldn't be an easy way to inject ID 5 between ID 4 and ID 2 without manipulating the date_entered field.
second edit:
Allowing the priority column to be decimal (priority 2 entered, then priority 2.5 entered, and so on), then using the rank() function to resolve that to an integer, would get around that. There isn't a pretty answer here that I can find.
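To make the decimal idea concrete, here is a small Python sketch of what rank() over (order by priority, date_entered) would resolve to. The dates and the choice of 1.5 for the new row are hypothetical values picked so that ID 4 lands in second place, as in the question's example; the function name resolve_priorities is made up.

```python
from datetime import date
from decimal import Decimal

def resolve_priorities(rows):
    """Map id -> rank, like RANK() OVER (ORDER BY priority, date_entered).

    rows is a list of (id, priority, date_entered) tuples. Rows with an
    equal (priority, date_entered) pair share a rank, as RANK() does.
    """
    ordered = sorted(rows, key=lambda r: (r[1], r[2]))
    ranks = {}
    prev_key, prev_rank = None, 0
    for position, (row_id, priority, entered) in enumerate(ordered, start=1):
        key = (priority, entered)
        if key != prev_key:
            prev_key, prev_rank = key, position
        ranks[row_id] = prev_rank
    return ranks

rows = [
    (1, Decimal("1"), date(2024, 1, 1)),
    (2, Decimal("2"), date(2024, 1, 2)),
    (3, Decimal("3"), date(2024, 1, 3)),
    (4, Decimal("1.5"), date(2024, 1, 4)),  # new row slotted between 1 and 2
]
ranks = resolve_priorities(rows)
```

Here ID 1 keeps rank 1, ID 4 resolves to rank 2, and IDs 2 and 3 move to ranks 3 and 4 without any stored value being updated.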

DAX - selecting rows with partial match

I have a powerpivot table that contains 2 columns:
Column 1 contains strings.
Column 2 contains comma delimited strings.
I would like to display all the rows from column 1 whose column 2 value contains the selection from a filter or slicer. For example:
String Values
ABCD A,A,B
EFGH A,C
If A is selected I would display both rows, if B is selected I would display only row 1, and so on.
I know I can split the records, but this is not practical for me, as the above is only the tip of the iceberg. VBA is out of the question since this will be published in SharePoint. Does anybody have an idea how I could do that? Thanks.
I found the solution in a blog post by Javier Guillén:
http://javierguillen.wordpress.com/2012/02/10/simulating-an-approximate-match-vlookup-in-powerpivot/
If in my example the name of the table is "facts", I create a second unlinked table called dimRef that I populate with all possible values that I am interested to match: A,B,C,D...etc.
Then I define the measure M as:
M:=If ( Hasonevalue(facts[Values] ),
    Calculate (
        LASTNONBLANK (dimRef[String], 1 ),
        Filter ( dimRef, SEARCH(dimRef[String],Values(facts[String]),1,0) > 0 )
    )
)
I can then use the String column of the facts table and the measure in a pivot table, and use dimRef as a selector. It filters the rows according to the selection.
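As a rough illustration of the filtering behaviour, here is a Python sketch of what the measure does for each row. It is not DAX and the function names are made up; it assumes the search runs against the comma-delimited column, and it uses a plain substring test in the way SEARCH does, so a reference value such as "A" would also match inside a longer token.

```python
def measure_m(row_values, selected):
    """Last selected reference value found in row_values, else None.

    Loosely mirrors LASTNONBLANK over dimRef rows that pass the SEARCH filter.
    """
    matches = [ref for ref in sorted(selected) if ref in row_values]
    return matches[-1] if matches else None

def visible_rows(facts, selected):
    """Rows stay in the pivot table only when the measure is non-blank."""
    return [string for string, values in facts
            if measure_m(values, selected) is not None]

# The two rows from the question: (String, Values)
facts = [("ABCD", "A,A,B"), ("EFGH", "A,C")]

shown_for_a = visible_rows(facts, {"A"})
shown_for_b = visible_rows(facts, {"B"})
```

Selecting A keeps both rows, while selecting B keeps only ABCD.
<test>
assert shown_for_a == ["ABCD", "EFGH"]
assert shown_for_b == ["ABCD"]
assert visible_rows(facts, {"C"}) == ["EFGH"]
assert visible_rows(facts, {"D"}) == []
assert measure_m("A,C", {"A", "C"}) == "C"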
One small detail: the measure is not available in Power View. Does anybody know why?