I have a set of source data that looks something like this:
Project Series Paper
Unit 1 1806 1
Unit 1 1806 2
Unit 1 1806 3
Unit 2 1903 1
Unit 2 1903 2
Unit 2 2003 1
Unit 2 2003 2
Unit 2 2103 1
Unit 2 2103 2
Unit 3 1806 1
Unit 3 1906 1
This data normally lives in a database and is huge. In the order of half a million rows.
We also have users that will input a combination of Project, Series and Paper then they will click submit.
Prior to the data being submitted, I would like the data to be validated from the source data and will tell the user if the combination they have entered is valid or not.
Something like this:
Project Series Paper Valid?
Unit 1 1806 1 No
Unit 2 1906 2 Yes
The easiest solution that I can think of is to concatenate the data and do a lookup on each. However, this will create unnecessary heavy load on the database, where a new column will have to be created with half a million rows of data...
I was wondering if there is a loop function in VBA that would check the combination from the source data and let the user know if it is valid or not?
I really appreciate your input.
Ideally, this should be done in your SQL update and check for Primary Key conflicts, but, to do this in Excel, you could use the CountIfs function and check if you have any matches in your dataset.
So, suppose you have your DB table in range, say A1:C500000 and you have your input checker values in cells F2:H2, you could use the following formula for your Valid? in cell I2:
I2: =IF(COUNTIFS(A1:A500000,F2,B1:B500000,G2,C1:C500000,H2)=0,"Yes", "No")
That should do the trick.
Related
I searched a lot, but didn't find an answer to the following question:
Financial data often come as daily data but with missing dates (weekends, banking holidays ...). I would like to have those data really on a daily basis with missing values, where originally the dates were missing.
So far I did this in liberoffice-calc half-manually, which takes a lot of time. I didn't find ways to really automate this, as there is no fixed rule, which dates are missing.
Example:
I have:
21/12/18 1
27/12/18 2
28/12/18 3
02/01/19 4
I want:
21/12/18 1
22/12/18
23/12/18
24/12/18
25/12/18
26/12/18
27/12/18 2
28/12/18 3
29/12/18
30/12/18
31/12/18
01/01/19
02/01/19 4
I'm not familiar with liberoffice-calc. In Excel or Google Sheets, I would use a lookup table.
In one tab of the spreadsheet, I was enter the dates I want in column A. In another tab, I would place the actual data I have. Then, in column B of the first tab, I would lookup the value for that day from the data on the second tab.
Assuming 1 is in B2, put a start date in say D2 and in E2:
=IFERROR(INDEX(B:B,MATCH(D2,A:A,0)),"")
then copy both down to suit.
i have a Datagrid that stores the number pencils produced each day of the month it looks like this:
Pencil | day 1 | day 2 | day 3 | ... |day 31
Red 0 0 13 0 0
blue 5 1 0 8 0
yellow 0 9 5 0 0
I need to save this data into SQL table but im not sure what's the most efficent way to design the table in SQL.
I was thinking about creating a table in SQL with the fields:
pencilmodel
date
quantity
and then in vb.net making a loop that saves 1 by 1 each cell of the datagrid in to the table, but i dont think this is the best way since i will have like 30 rows and a month has 31 days max so it will be 30*31= 930 times.
Im using VB.net and SQL Server
i would create the table that way (as you suggested):
ID | pencilmodel | ProducedDate | Quantity
1 blue dd-mm-yyyy 7
2 red dd-mm-yyyy 4
3 yellow dd-mm-yyyy 6
also, dont loop and insert each row to database, its not efficient, add it to a dataset first and then update it using DataAdapter.Update or bind a dataset to the datagrid view:
How to: Bind Data to the Windows Forms DataGridView Control
I dont know if this one is relevant but why dont you create a fields based on the date and time? lets say like this in your PC
12/14/2016
You can create a program that will create a field for you everyday for example when the day passes by then add a column look like this.
__________________________________
|12/14/2016|12/15/2016|12/15/2016|
so what will happen is you dont need to loop in DGV you just do your INSERT COMMAND
you just need some modifications and validations in here like
if Date_Has_Been_Changed then
Create Table Add Columns
End If
Excel build-in functions are, at most of the time, effective. However, there are some functions really like implemented half-way and some how dictated their usage. The SUBTOTAL function is one of them.
My data is presented in the following format:
Value Count
100 20
102 3
105 4
102 5
And I want to build a table in this format:
Value Count
100 20
101 0
102 8
103 0
104 0
105 4
I've read this in SO but my situation is a bit differ. Pivot table will be able to give you the subtotals of the values appears in the original data and I don't want to have a loop to insert missing values in the original data (if it is gonna to be a loop over the original data, the loop could use to build the table - which I would prefer to avoid at all)
I have a google spreadsheet for my gaming information. It contains 2 sheets - one for monster information, another for team.
Monster information sheet contains the attack value, defend value, and the mana cost of monsters. It's almost like a database of monsters that I can summon.
Team sheet does the following:
Asks for the amount of mana I currently have.
Computes a list of up to 5 monsters that I can summon (it can be less than 5).
Each monster has their own mana cost, therefore total mana cost mustn't exceed the amount of mana I have given in point 1.
The tabulated list should give me a team that have the highest combined attack value. It does not matter how many monsters are summoned. Each monster cannot be summoned twice though.
I have been thinking of using query() function so that I can make use of SQL statements. (so that I can hopefully retrieve the tabulated list directly)
Sample: Monster Info
A B C D
1 Monster Attack Defense Cost
2 MonA 1200 1200 35
3 MonB 1400 1300 50
... ...
Sample: Team
A B C D
1 Mana 120
2
3 Attack Team
4 Monster Attack Cost Total Attack
5 MonB 1400 50 1400
6 MonA 1200 35 2600
7 ... ...
I have these formula in "Team" sheet
A5: =query('Monster Info'!$A$:$D,"SELECT A,B,D ORDER BY B DESC LIMIT 5")
B5: =CONTINUE(A5, 1, 2)
C5: =CONTINUE(A5, 1, 3)
D5: =C5
A6: =CONTINUE(A5, 2, 1)
B6: =CONTINUE(A5, 2, 2)
C6: =CONTINUE(A5, 2, 3)
D6: =D5+C6
That only gets the 5 best attack monsters, regardless of the mana cost consideration. How do I do that such that it takes consideration of both attack value and mana cost value? There is another problem shown in the example below:
Example: (simplified version, without defense value etc)
Monster Attack Cost
MonA 1400 50
MonB 1200 35
MonC 1100 30
MonD 900 25
MonE 500 20
MonF 400 15
MonG 350 10
MonH 250 5
If I have 160 mana, then the obvious team is A+B+C+D+E (5100 Attack).
If I have 150 mana, it becomes A+B+C+D+G (4950 Attack).
If I have 140 mana, it becomes A+B+C+D (4600 Attack).
If I have 130 mana, it becomes B+C+D+E+F (4100 Attack using 125 mana) or A+B+C+F (4100 Attack using all 130 mana).
If I have 120 mana, it becomes B+C+D+E+G (4050 Attack).
If I have 110 mana, it becomes B+C+D+F+H (3850 Attack).
As you can see, there isn't really a pattern within the results.
Any expert willing to share their insights on this?
I've played with the problem for an hour and I only have a workaround here. Your problem seems to be a standard linear programming task which should can easily be solved by a "Solver" software. There used to be a so called "Solver" in google spreadsheet, but unfortunately it was removed from the newest version. If you are not insisting on Google solution, you should try it in one of the Solver-supported spreadsheet manager softwares.
I tried MS Office (it has a Solver add-in, installation guide: http://office.microsoft.com/en-001/excel-help/load-the-solver-add-in-HP010342660.aspx).
Before you run the solver, you should prepare your original dataset a bit, with helper columns and cells.
Add a new column next to the "Cost" column (let's assume it is column "D"), and under it put each row either 0, or 1. This column will tell you if a monster is selected to the attack team or not.
Add two more columns ("E" and "F" respectively). These columns will be products of the Attack and of the Cost respectively. So you should write a function to the E2 cell: =b2*d2, and for the F2 cell: =c2*d2. With this way if a monster is selected (which is told by the D column, remember), the appropriate E and F cells will be non zero values, aotherwise they will be 0.
Create a SUM row under the last row, and create a summarizing function for the D,E,F columns respectively. So in my spreadsheet D10 cell gets its value like this: =sum(d2:d9), and so on.
I created a spreadsheet to show these steps: https://docs.google.com/spreadsheets/d/1_7XRlupEEwat3CthSSz8h_yJ44MysK9hMsj0ijPEn18/edit?usp=sharing
Remember to copy this worksheet to an MS Office worksheet, before you start the Solver.
Now, you are ready to start the Solver. (Data menu, Solver in MS Office). You can see a video here on using the Solver: https://www.youtube.com/watch?v=Oyc0k9kiD7o
It's not that hard as it looks like, but for this case I'll describe what to write where:
Set Objective: you should select the "E10" cell, as that represents the sum of all the attack points.
Check "Max" radiobutton as we would like to maximize the value of the attacks.
By Changing variable cells: Select the "d2:d9" interval as those cells are representing whether a monster is selected or not. The solver will try to adjust these values (0, or 1) in order to maximise the sum attack.
Subject to the Contraints: Here we should add some constraints. Click on the Add button, and then:
First we should ensure that d2:d9 are all binary values. So "Cell reference" should be "d2:d9" and from the dropdown menu, select "bin" as binary.
Another constraint should be that the sum of the selected monsters should not exceed 5. So select the cell where the sum of the selected monsters is represented (D10) and add "<=" and the value "5"
Finally we cannot use more manna that we have, so select the cell in which you store the sum of used manna (F2), and "<=", and add the whole amount of manna we can spend in my case it's in the I2 cell).
Done. It should work, in my case it worked at least.
Hope it helps anyway.
Below is some data:
Test Day1 Day2 Score
A 1 2 100
B 1 3 62
C 3 4 90
D 2 4 20
E 4 5 80
I am trying to take the values from column 'day' and 'day2' and use them to select the row number for the column score. For example for Test A I would like to find the sum of 100 and 62 because that is the values of the first and second rows of score. Test B I would like to find the sum of 100, 62 and 90.
Is their anyway to do this in the Compute Variable window? Found in the menu Transform-Compute Variable?
I tried the following:
Score(MEAN(VALUE(Day1), VALUE(DAY2)))
This is not the proper way to call the cell location of Score and I received an error.
Can anyone help?
Thank you!
You really have two different datasets here. One is a dataset of scores numbered 1 through 5.
The other is a dataset that includes indexes into the score dataset. So the steps would be something like this.
First take the scores dataset and transpose it so that it has one row and 5 columns (Data>Transpose)
Then match that dataset to each case in the main dataset (Data>Merge Files>Add Variables).
Next you have to resort to using syntax directly.
You would declare a vector for the scores (VECTOR)
Finally, you use COMPUTE to index into the scores.
For your real problem, I suppose that you might have batches of scores and maybe there are some gaps. The Restructure Data Wizard can help you generalize this - convert cases into variables, but let's not go there yet.
HTH,
Jon Peck