Compile all values across multiple delimited strings into a table - sql

I am collecting responses to an online survey form in a table like this:
CREATE TABLE [Survey]
(
ID int IDENTITY(1,1) NOT NULL,
UserName varchar(50) NOT NULL,
Responses varchar(max) NOT NULL,
Taken datetime NOT NULL
)
When the user clicks the submit button, a process grabs all the checkboxes that were ticked, concatenates their names into a delimited string, and stores that in the table along with the other fields. Essentially the same as:
INSERT INTO [Survey] (UserName, Responses, Taken) VALUES ('John', 'chkSize', GetDate())
INSERT INTO [Survey] (UserName, Responses, Taken) VALUES ('Mary', 'chkSquare;chkSoft', GetDate())
INSERT INTO [Survey] (UserName, Responses, Taken) VALUES ('Steve', 'chkSize;chkYellow;chkRound', GetDate())
INSERT INTO [Survey] (UserName, Responses, Taken) VALUES ('April', 'chkRound;chkStacked;chkFiltered;chkBrown', GetDate())
Is there a way to easily go through all the "Responses" for the whole table, find all possible values, and then return them as a Unique list in their own table? i.e.:
chkBrown
chkFiltered
chkRound
chkSize
chkSoft
chkSquare
chkStacked
chkYellow

You can do what you want using string_split():
select s.value, count(*)
from survey su cross apply
string_split(su.responses, ';') s
group by s.value;
Here is a db<>fiddle.
The fact that you can do this does not mean that you should. You should store the responses in a separate table, with one row per response.
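A minimal sketch of the one-row-per-response design, written in Python with the built-in sqlite3 module rather than SQL Server so that it runs anywhere. The SurveyResponse table and the save_survey helper are hypothetical names chosen for illustration; they are not part of the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Survey (
        ID INTEGER PRIMARY KEY AUTOINCREMENT,
        UserName TEXT NOT NULL,
        Taken TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
    );
    -- Hypothetical child table: one row per ticked checkbox.
    CREATE TABLE SurveyResponse (
        SurveyID INTEGER NOT NULL REFERENCES Survey(ID),
        Response TEXT NOT NULL
    );
""")

def save_survey(user_name, delimited):
    """Split the delimited string once, at insert time."""
    cur = conn.execute("INSERT INTO Survey (UserName) VALUES (?)", (user_name,))
    rows = [(cur.lastrowid, name) for name in delimited.split(";") if name]
    conn.executemany("INSERT INTO SurveyResponse VALUES (?, ?)", rows)

save_survey("John", "chkSize")
save_survey("Mary", "chkSquare;chkSoft")
save_survey("Steve", "chkSize;chkYellow;chkRound")
save_survey("April", "chkRound;chkStacked;chkFiltered;chkBrown")

# The unique list is now a plain DISTINCT query, with no string splitting.
distinct_responses = [row[0] for row in conn.execute(
    "SELECT DISTINCT Response FROM SurveyResponse ORDER BY Response")]
print(distinct_responses)
```

With the data stored this way, counting or filtering by response needs only ordinary joins and GROUP BY.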

If this is just a simple one-page, checkbox-only survey, one approach is a flags enum:
// declare flags enum
[Flags]
public enum Checkboxes : int
{
none = 0,
chkBrown = 1,
chkFiltered = 2,
chkRound = 4,
chkSize = 8,
chkSoft = 16,
chkSquare = 32,
chkStacked = 64,
chkYellow = 128
}
// on initialize/constructor add these values to your checkbox tag
chkBrown.Tag = Checkboxes.chkBrown ;
// Add a checkbox extension method (it must be declared inside a static class)
public static Checkboxes GetCode(this CheckBox cb)
{
    if (cb.Checked)
        return (Checkboxes)cb.Tag;
    return Checkboxes.none;
}
// your db value would be
Checkboxes val = chkBrown.GetCode() | chkFiltered.GetCode() . . . // list all c-boxes here
// make db field integer and save this value:
(int)val
But again, this is only good if the system is static and no changes will be required. This seems to be homework with no long-term concerns. For the long term, the usual design is a many-to-many table, where each selected answer to a question is stored as a separate record. That makes searching in SQL easy.
Here is a working fiddle, where you can also see how to test a checkbox against the value retrieved from the stored number:
using System;
public class ClsVal// instead of checkbox
{
public bool A {get; set;}
public Checkboxes C {get; set;}
}
public static class ClsValExt
{
public static Checkboxes GetCode(this ClsVal cb)
{
if (cb.A)
return (Checkboxes)cb.C;
return Checkboxes.none;
}
}
[Flags]
public enum Checkboxes : int
{
none = 0,
chkBrown = 1,
chkFiltered = 2,
chkRound = 4,
chkSize = 8,
chkSoft = 16,
chkSquare = 32,
chkStacked = 64,
chkYellow = 128
}
public class Program
{
public static void Main()
{
var c1 = new ClsVal() {A = true, C = Checkboxes.chkBrown};
var c2 = new ClsVal() {A = true, C = Checkboxes.chkFiltered};
var c3 = new ClsVal() {A = false, C = Checkboxes.chkRound};
var c4 = new ClsVal() {A = true, C = Checkboxes.chkSize};
var x = c2.GetCode() | c1.GetCode() | c3.GetCode() | c4.GetCode();
var i = (int)x;
Console.WriteLine(i);
Console.WriteLine((x & Checkboxes.chkBrown) == Checkboxes.chkBrown); // True: chkBrown is set
Console.WriteLine((x & Checkboxes.chkYellow) == Checkboxes.chkYellow); // False: chkYellow is not set
}
}
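The same bit-flag encoding can be sketched in Python with enum.IntFlag, mainly to show how a stored integer is decoded back into checkbox names. The decode helper is a hypothetical name introduced for this illustration.

```python
from enum import IntFlag

class Checkboxes(IntFlag):
    none = 0
    chkBrown = 1
    chkFiltered = 2
    chkRound = 4
    chkSize = 8
    chkSoft = 16
    chkSquare = 32
    chkStacked = 64
    chkYellow = 128

def decode(stored):
    """Return the names of the flags set in a stored integer."""
    value = Checkboxes(stored)
    # Skip the zero member explicitly; it is "contained" in every value.
    return [flag.name for flag in Checkboxes if flag.value and flag in value]

# Brown, Filtered and Size ticked gives 1 | 2 | 8.
stored = int(Checkboxes.chkBrown | Checkboxes.chkFiltered | Checkboxes.chkSize)
print(stored, decode(stored))
```

Note that adding or renumbering a checkbox later changes the meaning of every stored integer, which is why this only suits a static form.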

Related

Android Studio SQLite Column Sum

I have a table with several columns and I'm looking to create a function that will return the sum of all the entries in a particular column.
For debug purposes, I've simply set the return string to "1" to confirm the rest of my code works properly, which it does.
Can anyone help me with the necessary code to sum the column and return the value?
public String TotalServings(){
SQLiteDatabase db = getReadableDatabase();
//From the table "TABLE_PRODUCTS" I want to sum all the entries in the column "COLUMN_SERVINGS"
String Servings = "1";
return Servings;
}
Here's what you need:
// This method returns an int; use String.valueOf() if you need a String
public int TotalServings() {
    int serving = 0;
    SQLiteDatabase db = getReadableDatabase();
    // Assumes TABLE_PRODUCTS and COLUMN_SERVINGS are String constants holding the table and column names
    Cursor cursor = db.rawQuery(
        "SELECT SUM(" + COLUMN_SERVINGS + ") FROM " + TABLE_PRODUCTS, null);
    if (cursor.moveToFirst()) {
        serving = cursor.getInt(0);
    }
    cursor.close();
    return serving;
}
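The SUM query itself can be tried outside Android. Below is a small sketch using Python's built-in sqlite3 module; the products table, its columns, and the total_servings helper are made-up stand-ins for TABLE_PRODUCTS and COLUMN_SERVINGS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT, servings INTEGER)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [("apple", 2), ("bread", 5), ("milk", 3)],
)

def total_servings(connection):
    # SUM returns NULL for an empty table; COALESCE turns that into 0.
    row = connection.execute(
        "SELECT COALESCE(SUM(servings), 0) FROM products").fetchone()
    return row[0]

print(total_servings(conn))
```

The COALESCE guard is a design choice for the empty-table case; without it the query yields NULL rather than 0.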

Apache Pig: strip namespace prefix (::) after group operation

A common pattern in my data processing is to group by some set of columns, apply a filter, then flatten again. For example:
my_data_grouped = group my_data by some_column;
my_data_grouped = filter my_data_grouped by <some expression>;
my_data = foreach my_data_grouped generate flatten(my_data);
The problem here is that if my_data starts with a schema like (c1, c2, c3), after this operation it will have a schema like (my_data::c1, my_data::c2, my_data::c3). Is there a way to easily strip off the "my_data::" prefix if the columns are unique?
I know I can do something like this:
my_data = foreach my_data generate c1 as c1, c2 as c2, c3 as c3;
However that gets awkward and hard to maintain for data sets with lots of columns and is impossible for data sets with variable columns.
If all fields in a schema have the same set of prefixes (e.g. group1::id, group1::amount, etc) you can ignore the prefix when referencing specific fields (and just reference them as id, amount, etc)
Alternatively, if you're still looking to strip a schema of a single level of prefixing you can use a UDF like this:
public class RemoveGroupFromTupleSchema extends EvalFunc<Tuple> {
@Override
public Tuple exec(Tuple input) throws IOException {
Tuple result = input;
return result;
}
@Override
public Schema outputSchema(Schema input) throws FrontendException {
if(input.size() != 1) {
throw new RuntimeException("Expected input (tuple) but input does not have 1 field");
}
List<Schema.FieldSchema> inputSchema = input.getFields();
List<Schema.FieldSchema> outputSchema = new ArrayList<Schema.FieldSchema>(inputSchema);
for(int i = 0; i < inputSchema.size(); i++) {
Schema.FieldSchema thisInputFieldSchema = inputSchema.get(i);
String inputFieldName = thisInputFieldSchema.alias;
Byte dataType = thisInputFieldSchema.type;
String outputFieldName;
int findLoc = inputFieldName.indexOf("::");
if(findLoc == -1) {
outputFieldName = inputFieldName;
}
else {
outputFieldName = inputFieldName.substring(findLoc+2);
}
Schema.FieldSchema thisOutputFieldSchema = new Schema.FieldSchema(outputFieldName, dataType);
outputSchema.set(i, thisOutputFieldSchema);
}
return new Schema(outputSchema);
}
}
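The renaming rule used by the UDF above (drop everything up to and including the first "::") is easy to check in isolation. This Python sketch mirrors only that string logic, not the Pig API; strip_prefix and strip_schema are hypothetical helper names, and the duplicate check is an added safeguard that the UDF itself does not perform.

```python
def strip_prefix(alias):
    """Drop everything up to and including the first '::', if present."""
    _, sep, rest = alias.partition("::")
    return rest if sep else alias

def strip_schema(aliases):
    """Strip one prefix level from every alias, refusing ambiguous results."""
    stripped = [strip_prefix(a) for a in aliases]
    if len(set(stripped)) != len(stripped):
        raise ValueError("stripping prefixes would create duplicate column names")
    return stripped

print(strip_schema(["my_data::c1", "my_data::c2", "c3"]))
```

Only one level is removed per call, so an alias such as a::b::c becomes b::c.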
You can put the 'AS' statement on the same line as the 'foreach'.
i.e.
my_data_grouped = group my_data by some_column;
my_data_grouped = filter my_data_grouped by <some expression>;
my_data = FOREACH my_data_grouped GENERATE FLATTEN(my_data) AS (c1, c2, c3);
However, this is just the same as doing it on 2 lines, and does not alleviate your issue for 'data sets with variable columns'.

Get a value from array based on the value of others arrays (VB.Net)

Suppose that I have two arrays:
Dim RoomName() As String = {"RoomA", "RoomB", "RoomC", "RoomD", "RoomE"}
Dim RoomType() As Integer = {1, 2, 2, 2, 1}
I want to get a value from the "RoomName" array based on a criterion in the "RoomType" array. For example, to get a "RoomName" with "RoomType = 2", the algorithm should pick at random among the indexes where "RoomType" is 2, that is, a single value from indexes 1 to 3 only.
Is there any way to solve this using arrays, or is there a better way to do it? Thank you very much for your time :)
Note: The code examples below use C#, but hopefully you can read the intent for VB.NET.
Well, a simpler way would be to have a structure/class that contained both name and type properties e.g.:
public class Room
{
public string Name { get; set; }
public int Type { get; set; }
public Room(string name, int type)
{
Name = name;
Type = type;
}
}
Then given a set of rooms you can find those of a given type using a simple linq expression:
var match = rooms.Where(r => r.Type == 2).Select(r => r.Name).ToList();
Then you can find a random entry from within the set of matching room names (see below)
However assuming you want to stick with the parallel arrays, one way is to find the matching index values from the type array, then find the matching names and then find one of the matching values using a random function.
var matchingTypeIndexes = new List<int>();
int matchingTypeIndex = -1;
do
{
matchingTypeIndex = Array.IndexOf(roomType, 2, matchingTypeIndex + 1);
if (matchingTypeIndex > -1)
{
matchingTypeIndexes.Add(matchingTypeIndex);
}
} while (matchingTypeIndex > -1);
List<string> matchingRoomNames = matchingTypeIndexes.Select(typeIndex => roomName[typeIndex]).ToList();
Then to find a random entry of those that match (from one of the lists generated above):
var posn = new Random().Next(matchingRoomNames.Count);
Console.WriteLine(matchingRoomNames[posn]);
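For comparison, the parallel-array approach fits in a few lines of Python. This is only a sketch of the same idea (collect the names whose type matches, then pick one at random); pick_room is a hypothetical helper name, and the room names are assumed to be plain strings.

```python
import random

room_names = ["RoomA", "RoomB", "RoomC", "RoomD", "RoomE"]
room_types = [1, 2, 2, 2, 1]

def pick_room(wanted_type):
    """Return a random room name of the given type, or None if none match."""
    candidates = [name for name, kind in zip(room_names, room_types)
                  if kind == wanted_type]
    if not candidates:
        return None
    return random.choice(candidates)

print(pick_room(2))
```

Returning None for an unknown type avoids the exception that indexing into an empty list would raise.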

Aggregate replace in SQL Server?

What I'm trying to achieve is to make dynamic a series of replacements that have to be performed on a certain field. (To make things even easier, I in fact want to remove data, so I will always be replacing with an empty string.)
Say that sometimes I will have to do just one replacement:
... REPLACE(myField, stringToRemove, '')
Sometimes, I will need two replacements:
... REPLACE(REPLACE(myField, stringToRemove, ''), anotherStringToRemove, '')
However, I need to make this dynamic and I do not know in advance how many of those values I'll have, and so, how many replacements (removals) I'll have to do.
I tried searching for aggregate string manipulation functions and, of course, there's none. I also know that this can be achieved through a CLR aggregate function but I don't have the possibility of using it.
Any ideas?
You can setup a table variable with FromValue and ToValue and use a while loop to do the replacements.
-- Table to replace in
declare @T table
(
  Value varchar(50)
)
insert into @T values
('first second third'),
('first second third')
-- Table with strings to replace
declare @Rep table
(
  ID int identity primary key,
  FromValue varchar(50),
  ToValue varchar(50)
)
insert into @Rep values
('second', 'fourth'),
('third', 'fifth')
declare @ID int
select @ID = max(ID)
from @Rep
while @ID > 0
begin
  update @T
  set Value = replace(Value, FromValue, ToValue)
  from @Rep
  where ID = @ID
  set @ID -= 1
end
select *
from @T
Result:
Value
-------------------
first fourth fifth
first fourth fifth
If you only want to query the values you can do something like this.
;with C as
(
  select 0 as ID,
         Value,
         0 as Lvl
  from @T
  union all
  select R.ID,
         cast(replace(C.Value, R.FromValue, R.ToValue) as varchar(50)),
         Lvl + 1
  from @Rep as R
    inner join C
      on C.ID + 1 = R.ID
)
select top 1 with ties Value
from C
order by Lvl desc
Once you implement the CLR aggregate function below, you can do:
SELECT dbo.ReplaceAgg(t.[text], w.badword, w.goodword) -- call CLR aggregate function
FROM [Texts] t CROSS JOIN BadWords w
GROUP BY t.[text]
CLR aggregate function in C#
/// <summary>
/// Allows to apply regex-replace operations to the same string.
/// For example:
/// SELECT dbo.ReplaceAgg(t.[text], w.badpattern, "...")
/// FROM [Texts] t CROSS JOIN BadPatterns w
/// GROUP BY t.[text]
/// </summary>
[Serializable]
[Microsoft.SqlServer.Server.SqlUserDefinedAggregate(Format.UserDefined,
IsInvariantToDuplicates = true, IsInvariantToOrder = false,
IsInvariantToNulls = true, MaxByteSize = -1)]
public class RegexReplaceAgg : IBinarySerialize
{
private string str;
private string needle;
private string replacement;
public void Init()
{
str = null;
needle = null;
replacement = null;
}
public void Accumulate(SqlString haystack, SqlString needle, SqlString replacement)
{
// Null values are excluded from aggregate.
if (needle.IsNull) return;
if (replacement.IsNull) return;
if (haystack.IsNull) return;
str = str ?? haystack.Value;
this.needle = needle.Value;
this.replacement = replacement.Value;
str = Regex.Replace(str, this.needle, this.replacement, RegexOptions.Compiled | RegexOptions.CultureInvariant);
}
public void Merge(RegexReplaceAgg group)
{
Accumulate(group.Terminate(), new SqlString(needle), new SqlString(replacement));
}
public SqlString Terminate() => new SqlString(str);
public void Read(BinaryReader r)
{
str = r.ReadString();
needle = r.ReadString();
replacement = r.ReadString();
}
public void Write(BinaryWriter w)
{
w.Write(str);
w.Write(needle);
w.Write(replacement);
}
}
You might have to write a scalar function to which you pass the original string, and enough information for it to know which strings to remove, and have it loop through them and return the result of the set of replacements.
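The looping idea behind such a scalar function is a fold: start with the original string and apply one removal per entry in the list. A minimal Python sketch of that logic follows; remove_all is a hypothetical name, and this illustrates the algorithm only, not T-SQL.

```python
from functools import reduce

def remove_all(text, strings_to_remove):
    """Apply one replace-with-empty-string per entry, in list order."""
    return reduce(lambda acc, s: acc.replace(s, ""), strings_to_remove, text)

print(remove_all("first second third", ["second ", "third"]))
```

The removals are applied in order, so when one string to remove contains another, the order of the list can change the result.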

How to construct NHibernate criteria to find parents who has all specified children

A Project can have many Parts. A property on Part is Ipn, which is a string of digits.
Project "A" has Parts "1", "2", "3"
Project "B" has Parts "2", "3", "4"
Project "C" has Parts "2"
Project "D" has Parts "3"
I want to find all Projects that have all of the specified parts associated with it. My current query is
var ipns = new List<String> { "2", "3" };
var criteriaForIpns = DetachedCriteria
.For<Part>()
.SetProjection(Projections.Id())
.Add(Expression.In("Ipn", ipns));
_criteriaForProject
.CreateCriteria("Ipns")
.Add(Subqueries.PropertyIn("Id", criteriaForIpns));
This gives me back all Projects that have any of the parts, thus the result set is Projects A, B, C, and D.
The SQL where clause generated, looks something like
WHERE part1_.Id in (SELECT this_0_.Id as y0_
FROM Parts this_0_
WHERE this_0_.Ipn in ('2' /* @p0 */,'3' /* @p1 */))
My desired result would only be Projects A and B. How can I construct the NHibernate criteria to get the result set that I need?
The number of parts I search on can vary, it can be n number of parts.
Yesterday I was working on a similar problem.
I had to select/load all parent objects with exactly the given list of child objects.
I could solve this with the Criteria API, with only one drawback (see *1 below).
public class Project
{
public virtual int ProjectId{get;set;}
public virtual IList<Part> Parts{get;set;}
...
}
public class Part
{
public virtual int PartId{get;set;}
public virtual Project Project{get;set;} // *1 this is the drawback: I need a public property for the foreign key from the child to the parent
...
}
Here comes the Criteria:
DetachedCriteria top = DetachedCriteria.For<Project>();
foreach(Part part in searchedParts)
{
    DetachedCriteria sub = DetachedCriteria.For<Part>();
    sub.Add(Expression.Eq("PartId", part.PartId));
    // SetProjection expects an IProjection, so wrap the property name
    sub.SetProjection(Projections.Property("Project"));
    top.Add(Subqueries.PropertyIn("ProjectId", sub));
}
Back to your example: the SQL would look like this.
SELECT * FROM project
WHERE
projectid IN ( SELECT projectid FROM part WHERE partid = 1 /* @p0 */ )
AND projectid IN ( SELECT projectid FROM part WHERE partid = 2 /* @p1 */ )
Basically, I add a subquery for each child that checks for its existence in the project and combine the subqueries with AND, so only projects with all of those children will be selected.
Greetings
Juy Juka
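The one-subquery-per-child idea can be tried against the Project/Part data from the question. The sketch below uses Python's built-in sqlite3 module and plain SQL instead of NHibernate; the part table layout and the projects_with_all helper are assumptions made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE part (projectid TEXT, ipn TEXT);
    INSERT INTO part VALUES
        ('A','1'),('A','2'),('A','3'),
        ('B','2'),('B','3'),('B','4'),
        ('C','2'),
        ('D','3');
""")

def projects_with_all(ipns):
    """Build one IN-subquery per required part and AND them together."""
    ipns = list(ipns)
    clause = "projectid IN (SELECT projectid FROM part WHERE ipn = ?)"
    where = " AND ".join([clause] * len(ipns)) or "1 = 1"
    sql = f"SELECT DISTINCT projectid FROM part WHERE {where} ORDER BY projectid"
    return [row[0] for row in conn.execute(sql, ipns)]

print(projects_with_all(["2", "3"]))
```

Because the number of subqueries follows the length of the list, the same function handles any number of required parts.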
Additional Uses
I wasn't finished with my code after this, and in case someone needs what I had to find out, I'll add it here. I hope the additional information belongs here, but I am not sure because it's my first post on stackoverflow.com.
For the following examples we need a more complex part-class:
public class Part
{
public virtual int PartId{get;set;}
public virtual Project Project{get;set;}
public virtual PartType PartType{get;set;}
...
}
public class PartType
{
public virtual int PartTypeId{get;set;}
public virtual string Name{get;set;}
...
}
Different criterion on child-objects
It is possible to use the same code when you do not have the primary key(s) of the searched parts but would like to find the parts by other properties.
// I am assuming building projects with houses, gardens, garages, driveways, etc.
IEnumerable<PartType> searchedTypes = new PartType[]{housePart, gardenPart};
// could be a parameter, the user's choice, or whatever
DetachedCriteria top = DetachedCriteria.For<Project>();
foreach(PartType type in searchedTypes)
{
    DetachedCriteria sub = DetachedCriteria.For<Part>();
    sub.Add(Expression.Eq("PartType", type)); // this is all that had to be changed. We could even use more complex operations with and, or, not, etc.
    sub.SetProjection(Projections.Property("Project"));
    top.Add(Subqueries.PropertyIn("ProjectId", sub));
}
Expected SQL
SELECT * FROM project
WHERE
projectid IN ( SELECT projectid FROM part WHERE parttype = 1 /* @p0 // aka. housePart */ )
AND projectid IN ( SELECT projectid FROM part WHERE parttype = 2 /* @p1 // aka. gardenPart */ )
Excluding children
To negate this and search for parents who do not have the searched children, use Subqueries.PropertyNotIn instead of Subqueries.PropertyIn.
Exactly/only the searched children
This was the tricky part that I had to work on the longest. I wanted parents with exactly the given list of parts.
To stay with the building-project example: I am searching for projects with a house part and a garden part but no other parts.
IEnumerable<PartType> searchedTypes = new PartType[]{housePart, gardenPart};
DetachedCriteria top = DetachedCriteria.For<Project>();
ICriterion notCriterion = null;
foreach(PartType type in searchedTypes)
{
    ICriterion subCriterion = Expression.Eq("PartType", type);
    DetachedCriteria sub = DetachedCriteria.For<Part>();
    sub.Add(subCriterion);
    sub.SetProjection(Projections.Property("Project"));
    top.Add(Subqueries.PropertyIn("ProjectId", sub));
    // I am collecting all valid criteria for child objects and negating them
    subCriterion = Expression.Not(subCriterion);
    notCriterion = notCriterion == null ? subCriterion : Expression.And(notCriterion, subCriterion);
}
// with the negated criteria I exclude all parent objects with an invalid child object
DetachedCriteria not = DetachedCriteria.For<Part>();
not.Add(notCriterion);
// the projection belongs on "not"; "sub" is out of scope here
not.SetProjection(Projections.Property("Project"));
top.Add(Subqueries.PropertyNotIn("ProjectId", not));
Expected SQL
SELECT * FROM project
WHERE
projectid IN ( SELECT projectid FROM part WHERE parttype = 1 /* @p0 // aka. housePart */ )
AND projectid IN ( SELECT projectid FROM part WHERE parttype = 2 /* @p1 // aka. gardenPart */ )
AND projectid NOT IN ( SELECT projectid FROM part
WHERE
NOT ( parttype = 1 /* @p2 // aka. housePart */ )
AND NOT ( parttype = 2 /* @p3 // aka. gardenPart */ )
)
(More than one house part and/or garden part per project is possible, since no check for duplicate entries is done.)
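The "exactly these children" query can be checked on sample data. This is a sketch using Python's built-in sqlite3 module with plain SQL in place of the Criteria API; the part table, its sample rows, and the projects_with_exactly helper are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE part (projectid INTEGER, parttype TEXT);
    INSERT INTO part VALUES
        (1,'house'),(1,'garden'),
        (2,'house'),(2,'garden'),(2,'garage'),
        (3,'house'),
        (4,'house'),(4,'house'),(4,'garden');
""")

def projects_with_exactly(types):
    """Projects that have every listed part type and no other type."""
    types = list(types)
    has_each = ["projectid IN (SELECT projectid FROM part WHERE parttype = ?)"] * len(types)
    marks = ", ".join("?" * len(types))
    no_other = ("projectid NOT IN "
                f"(SELECT projectid FROM part WHERE parttype NOT IN ({marks}))")
    sql = ("SELECT DISTINCT projectid FROM part WHERE "
           + " AND ".join(has_each + [no_other]) + " ORDER BY projectid")
    # The type list is bound twice: once for the IN clauses, once for NOT IN.
    return [row[0] for row in conn.execute(sql, types + types)]

print(projects_with_exactly(["house", "garden"]))
```

Project 4 has two house parts and still qualifies, which matches the remark that duplicates are not checked.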
Your query requires that we make two joins from Project to Part. This is not possible in Criteria.
HQL
You can express this query directly in HQL.
var list = session.CreateQuery( @"
select proj from Project proj
inner join proj.Parts p1
inner join proj.Parts p2
where p1.Id=:id1
and p2.Id=:id2
" )
.SetInt32( "id1", 2 )
.SetInt32( "id2", 3 )
.List<Project>();
Criteria
With the Criteria API, you would query for those Projects that have any one of the specified Parts, and then filter the results in C#.
Either have the criteria eager load Project.Parts, or map that as lazy="extra".
Then, using your existing criteria query from above:
// Load() these if necessary
List<Part> required_parts;
var list = _criteriaForProject.List<Project>()
    .Where( proj => {
        foreach( var p in required_parts ) {
            if (!proj.Parts.Contains( p )) {
                return false;
            }
        }
        // only reached when every required part was found
        return true;
    });
// if _criteriaForProject is a Detached Criteria, that would be:
var list = _criteriaForProject.GetExecutableCriteria( session )
.List<Project>()
.Where( // etc
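The client-side filtering step amounts to a subset test per project. A small Python sketch of that check follows, using an in-memory dictionary as a hypothetical stand-in for the loaded Project.Parts collections; projects_having_all is an illustrative name.

```python
# Hypothetical stand-in for the projects returned by the broad query,
# each mapped to the set of part numbers it contains.
projects = {
    "A": {"1", "2", "3"},
    "B": {"2", "3", "4"},
    "C": {"2"},
    "D": {"3"},
}

def projects_having_all(required_parts):
    """Keep only the projects whose parts include every required part."""
    required = set(required_parts)
    return sorted(name for name, parts in projects.items() if required <= parts)

print(projects_having_all(["2", "3"]))
```

This filters after loading, so it is simplest when the broad query returns a modest number of projects.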