Sort Excel Grouped Rows - vba

I have a spreadsheet that has information in groups. Each header row contains a company name and its information, and the grouped rows beneath it contain the names of people in that company.
Company Name | Number of Employees | Revenue |
Employee Name | Email | Phone
Is there any way to sort by the number of employees and/or revenue and keep the grouped employee information below the company it belongs to?
Normally when I try it, it will sort the company information but keep the employee information in the order that it is entered.

If I understand your question correctly, I have a way you can accomplish what you want (don't know if there is a more efficient method).
Write code which will, for each company header row, copy the number of employees and the revenue into two unused columns of your choice. The data needs to be copied into those columns for both the company header row and its employee detail rows.
In the third column assign a sequence number. This is to keep data together and in order when sorting by employee/revenue.
Now you can sort by either the newly created number of employees and/or revenue columns (along with the sequence column to maintain ordering within company).
After the sort you can delete the extra helper columns.
So if your data looked like this to start with...
A               B                  C
Penetrode       200                750000
Micheal Bolton  mbolton@pene.com   555-555-3333
Samir N         samirn@pene.com
Initech         500                500000
Bill Lumbergh   umumyeah@init.com  555-555-1212
Peter Gibbons   pgibbons@init.com  555-555-2222
Your code would then copy the employee count and revenue data and sequencify the rows using three unused columns.
A               B                  C             D    E       F
Penetrode       200                750000        200  750000  1
Micheal Bolton  mbolton@pene.com   555-555-3333  200  750000  2
Samir N         samirn@pene.com    555-555-3334  200  750000  3
Initech         500                500000        500  500000  4
Bill Lumbergh   umumyeah@init.com  555-555-1212  500  500000  5
Peter Gibbons   pgibbons@init.com  555-555-2222  500  500000  6
Then you can code a sort on any of the column combos: (D,F), (E,F), (D,E,F), or (E,D,F)
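The helper-column idea can be sketched outside Excel as well. The following Python is only an illustration of the technique, not VBA you can paste into a workbook: the function name `sort_groups` and the `is_header` test (a row counts as a company header when its second cell is a number) are assumptions made for this example, and it assumes the first row is a company header.

```python
def sort_groups(rows, key_index, is_header):
    """Sort company groups by a header column, keeping employees under their company."""
    decorated = []
    group_key = None
    for seq, row in enumerate(rows, start=1):
        if is_header(row):
            # Value that the answer copies into the helper column (D or E).
            group_key = row[key_index]
        # (helper value, sequence number, original row), like columns D/E and F.
        decorated.append((group_key, seq, row))
    # Sort on the helper value first, then on the sequence number.
    decorated.sort(key=lambda item: (item[0], item[1]))
    # Drop the helper values again, as in the final clean-up step.
    return [row for _, _, row in decorated]


rows = [
    ("Penetrode", 200, 750000),
    ("Micheal Bolton", "mbolton@pene.com", "555-555-3333"),
    ("Samir N", "samirn@pene.com", "555-555-3334"),
    ("Initech", 500, 500000),
    ("Bill Lumbergh", "umumyeah@init.com", "555-555-1212"),
    ("Peter Gibbons", "pgibbons@init.com", "555-555-2222"),
]


def is_header(row):
    # Assumption for this example: header rows carry a number in column B.
    return isinstance(row[1], int)


by_revenue = sort_groups(rows, 2, is_header)
by_employees = sort_groups(rows, 1, is_header)
```

Sorting by revenue moves Initech and its two employees above Penetrode, while sorting by employee count leaves the sample data in its original order.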

Better late than never, I suppose, but I feel my LAselect plugin would have solved your problem. I created this plugin because I do much non-standard 'stuff' with my data and needed a tool to handle it. LAselect can produce your 'group' output too and you would not need hidden columns or anything. I mean, you would not need to change the screens you are used to to sort them in whatever way you wanted.

Related

I am wondering if there is an elegant way to apply either a combination of query, Arrayformula, sort, functions in Google Sheets to do the following

Google Sheets Problem. I have a master list that has columns which are employers, job post, # of spots, parameter x, parameter y,...etc.
"Master Sheet" #a tab
Employers Job Spots
John Cleaner 1
Mike Cleaner 2
John Cleaner 3
John Server 5
Alice Cook 1
Dave Cook 1
Mary Cleaner 3
Alice Server 5
Alice Cleaner 2
Dave Server 4
Mike Server 3
Alice Server 1
This is what I would like "Output Sheet" #another tab with two columns. 1st is Jobs and 2nd is # of employers that account for 80% of the jobs in that category plus any additional filters. The idea is to give a single # that gives an 80/20 rule type metric. The trick is to Sort one column from highest to lowest first. I can do this but in multiple steps that seem annoyingly inefficient. I wonder if there is a better way where I can put everything in one cell and drag down or do a query function. The output looks like below.
Job # of employers that account for ~80% of all the jobs in that category + filters
Cleaner ~3
Cook 1
Server ~3
#because total Cleaner jobs is 11. 80% is 8.8. And sorting employers highest to lowest (after accounting for duplicates), 3 employers represent 80% of the Cleaner jobs available. Server total is 18, 80% is 14.4, so ~3 employers represent 80% of the Server jobs available.
Thank you all for your help.
To take 80% of the total for each job:
=query(A15:C26, "Select B, sum(C)*80/100 group by B label B 'Job'")
For the sample data you will get
{8.8, 1.6, 14.4}
for Cleaner, Cook and Server. You can continue from there yourself.
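The remaining steps (sum per employer, sort highest to lowest, count employers until the running total reaches 80%) are easier to see spelled out. This is a sketch in Python rather than a Sheets formula, and the function name `employers_for_share` is made up for the example. It uses a strict "running total >= 80%" cutoff, which is an assumption: with that rule Cook comes out as 2 rather than the approximate 1 shown in the question.

```python
from collections import defaultdict


def employers_for_share(rows, share=0.8):
    """For each job, count the top employers needed to cover `share` of its spots."""
    totals = defaultdict(lambda: defaultdict(int))
    for employer, job, spots in rows:
        totals[job][employer] += spots  # merges duplicate employer/job rows

    result = {}
    for job, per_employer in totals.items():
        target = share * sum(per_employer.values())
        running = 0
        count = 0
        # Largest employers first, as described in the question.
        for spots in sorted(per_employer.values(), reverse=True):
            running += spots
            count += 1
            if running >= target:
                break
        result[job] = count
    return result


master = [
    ("John", "Cleaner", 1), ("Mike", "Cleaner", 2), ("John", "Cleaner", 3),
    ("John", "Server", 5), ("Alice", "Cook", 1), ("Dave", "Cook", 1),
    ("Mary", "Cleaner", 3), ("Alice", "Server", 5), ("Alice", "Cleaner", 2),
    ("Dave", "Server", 4), ("Mike", "Server", 3), ("Alice", "Server", 1),
]

summary = employers_for_share(master)
```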

Counting number of occurrences of tuples in an m:n relationship

I'd like to know if there's an efficient way to count the number of occurrences of a permutation of entities from one side of the m:n relationship. Hopefully, the next example will illustrate properly what I mean:
Let's imagine a database with people and events of some sort. People can organize multiple events and events can be organized by more than one person. What I'd like to find out is whether a certain tuple of people has already organized an event or if it's their first time. My first idea is to add an attribute to the m:n relationship
PeopleID | EventID | TimesOrganized
100 1 1
200 1 1
300 2 1
400 3 1
Now, there's an event no. 4 that's again organized by persons 200 and 100 (let's say they should be added in that order). The new table should look like:
PeopleID | EventID | TimesOrganized
100 1 2
200 1 2
300 2 1
400 3 1
200 4 2
100 4 2
Now, if I added an event organized by persons 200 and 300 it would look like this:
PeopleID | EventID | TimesOrganized
100 1 2
200 1 2
300 2 1
400 3 1
200 4 2
100 4 2
200 5 1
300 5 1
How would I go about keeping the third column updated properly and what are my options?
I should also add that this is part of a larger project we have for one of our classes, and we'll be implementing an application that uses the database in some way, so I might as well move this to application logic if there's no easy way.
I wouldn't recommend tracking a TimesOrganized column as you suggest.
You can simply query it as needed using a COUNT(EventID) ... GROUP BY PeopleID.
If you do feel you need to maintain the value somewhere it probably is better normalized to the (presumed) table People. Something like People.TimesOrganized. But then you have to increment it as you go instead of just recalculating as needed.
If you want to count how many times someone has organized an event, the problem is not m:n but 1:m. Just count the events grouped by person. You don't really need that column in the table unless the value is needed very often.
That said, I find your table a little confusing: detail and aggregation are mixed, and the third table is downright wrong, since PeopleID 200 has organized 3 events and PeopleID 300 has organized 2.
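To make the "count on demand" suggestion concrete, here is a small sketch using Python's built-in sqlite3 module. The table name `Organizes` is an assumption (the question does not name the link table), and the data is the final table from the question without the TimesOrganized column. The second half counts events per group of organizers in application code, which is one possible reading of the "tuple of people" requirement.

```python
import sqlite3
from collections import Counter

conn = sqlite3.connect(":memory:")
# Hypothetical link table for the m:n relationship, without TimesOrganized.
conn.execute("CREATE TABLE Organizes (PeopleID INTEGER, EventID INTEGER)")
conn.executemany(
    "INSERT INTO Organizes VALUES (?, ?)",
    [(100, 1), (200, 1), (300, 2), (400, 3),
     (200, 4), (100, 4), (200, 5), (300, 5)],
)

# Events per person, recalculated whenever it is needed.
per_person = dict(conn.execute(
    "SELECT PeopleID, COUNT(EventID) FROM Organizes GROUP BY PeopleID"
))

# Events per group of organizers: collect the set of people for each event,
# then count how often each set occurs (order of people is ignored).
organizers = {}
for person, event in conn.execute("SELECT PeopleID, EventID FROM Organizes"):
    organizers.setdefault(event, set()).add(person)
per_group = Counter(frozenset(people) for people in organizers.values())
```

Here `per_group[frozenset({100, 200})]` is 2 because that pair organized events 1 and 4 together, while the pair 200 and 300 has organized only event 5.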

oracle - sql query select max from each base

I'm trying to write a query that finds the top balance at each base. Balances are in one table and bases are in another table.
This is the existing query I have. It returns all the results, but I need to find a way to limit it to the 1 top result per baseID.
SELECT o.names.name, t.accounts.bidd.baseID, MAX(t.accounts.balance)
FROM order o, table(c.accounts) t
WHERE t.accounts.acctype = 'verified'
GROUP BY o.names.name, t.accounts.bidd.baseID;
accounts is a nested table.
this is the output
Name accounts.BIDD.baseID MAX(T.accounts.BALANCE)
--------------- ------------------------- ---------------------------
Jerard 010 1251.21
john 012 3122.2
susan 012 3022.2
fin 012 3022.2
dan 010 1751.21
What I want is to calculate the highest balance for each baseID and display only one record for that baseID.
So the output would display only john for baseID 012, because he has the highest balance.
Any pointers in the right direction would be fantastic.
I think the problem is caused by the "Name" column. Since you have three names mapped to one base ID (012), each name forms its own group, so the three records are grouped individually and not together.
Try leaving the "Name" column out of both the select list and the GROUP BY clause.
SELECT t.accounts.bidd.baseID, MAX(t.accounts.balance)
FROM order o, table(c.accounts) t
WHERE t.accounts.acctype = 'verified'
GROUP BY t.accounts.bidd.baseID;
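Dropping the name gives one row per base but loses who holds the top balance. One common way to keep the name is to join the per-base maximum back to the rows. The sketch below shows that pattern with Python's sqlite3 module on a flat, hypothetical `balances` table; it is not Oracle nested-table syntax, and the table and column names are assumptions. If two people tie for the top balance at a base, both rows are returned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical flattened version of the question's output.
conn.execute("CREATE TABLE balances (name TEXT, base_id TEXT, balance REAL)")
conn.executemany(
    "INSERT INTO balances VALUES (?, ?, ?)",
    [("Jerard", "010", 1251.21), ("john", "012", 3122.2),
     ("susan", "012", 3022.2), ("fin", "012", 3022.2),
     ("dan", "010", 1751.21)],
)

# Find the maximum per base, then join back to pick up the matching name.
top_per_base = conn.execute(
    """
    SELECT b.base_id, b.name, b.balance
    FROM balances b
    JOIN (SELECT base_id, MAX(balance) AS top_balance
          FROM balances
          GROUP BY base_id) m
      ON m.base_id = b.base_id AND m.top_balance = b.balance
    ORDER BY b.base_id
    """
).fetchall()
```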

Rows to Dynamic columns in Access

I need a setup in Access where some rows in a table are converted to columns. For example, let's say I have this table:
Team Employee DaysWorked
Sales John 23
Sales Mark 3
Sales James 5
And then through the use of a query/form/something else, I would like the following display:
Team John Mark James
Sales 23 3 5
This conversion of rows to columns would have to be dynamic as a team could have any number of Employees and the Employees could change etc. Could anyone please guide me on the best way to achieve this?
You want to create a CrossTab query. Here's the SQL that you can use.
TRANSFORM SUM(YourTable.DaysWorked) AS DaysWorked
SELECT YourTable.Team
FROM YourTable
GROUP BY YourTable.Team
PIVOT YourTable.Employee
Of course the output is slightly different in that the columns are in alphabetical order.
Team James John Mark
Sales 5 23 3
For more detail see "Make summary data easier to read by using a crosstab query" at office.microsoft.com
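For readers who want to see what the crosstab does step by step, here is a rough Python equivalent. It is only a sketch of the same pivot, not Access code: the function name `crosstab` is made up, and it fills missing team/employee combinations with 0 where an Access crosstab would leave the cell empty.

```python
def crosstab(rows):
    """Pivot (team, employee, days) rows into one row per team."""
    # Column headings are built from the data, so they adapt as employees change.
    employees = sorted({employee for _, employee, _ in rows})
    table = {}
    for team, employee, days in rows:
        cells = table.setdefault(team, dict.fromkeys(employees, 0))
        cells[employee] += days  # same role as SUM(DaysWorked)
    header = ["Team"] + employees
    body = [[team] + [cells[e] for e in employees]
            for team, cells in sorted(table.items())]
    return [header] + body


days_worked = [("Sales", "John", 23), ("Sales", "Mark", 3), ("Sales", "James", 5)]
pivot = crosstab(days_worked)
```

As with the Access query, the employee columns come out in alphabetical order.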

Pulling items out of a DB with weighted chance

Let's say I had a table full of records that I wanted to pull random records from. However, I want certain rows in that table to appear more often than others (and which ones vary by user). What's the best way to go about this, using SQL?
The only way I can think of is to create a temporary table, fill it with the rows I want to be more common, and then pad it with other randomly selected rows from the table. Is there a better way?
One way I can think of is to add another column to the table that holds a rolling sum of your weights. Then generate a random number n with 0 <= n < the total of all your weights, and pull the row with the lowest rolling sum that is greater than n.
For example, if you had four rows with the following weights:
+---+--------+------------+
|row| weight | rollingsum |
+---+--------+------------+
| a | 3 | 3 |
| b | 3 | 6 |
| c | 4 | 10 |
| d | 1 | 11 |
+---+--------+------------+
Then choose a random number n with 0 <= n < 11, and return row a if 0<=n<3, b if 3<=n<6, c if 6<=n<10, and d if 10<=n<11.
Here are some links on generating rolling sums:
http://dev.mysql.com/tech-resources/articles/rolling_sums_in_mysql.html
http://dev.mysql.com/tech-resources/articles/rolling_sums_in_mysql_followup.html
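The lookup itself is short once the rolling sums exist. This is a minimal sketch in Python rather than SQL, using the four example rows above; the function name `pick` and its optional `n` argument (there only so the lookup can be checked with fixed numbers) are assumptions for the example.

```python
import bisect
import itertools
import random


def pick(rows, weights, n=None):
    """Return a row chosen with probability proportional to its weight."""
    rolling = list(itertools.accumulate(weights))  # e.g. [3, 6, 10, 11]
    if n is None:
        n = random.random() * rolling[-1]  # 0 <= n < total weight
    # Index of the first rolling sum that is strictly greater than n.
    index = bisect.bisect_right(rolling, n)
    # Guard against n landing exactly on the total through rounding.
    return rows[min(index, len(rows) - 1)]


rows = ["a", "b", "c", "d"]
weights = [3, 3, 4, 1]
chosen = pick(rows, weights)
```

In SQL the same lookup would be "the first row, ordered by rolling sum, whose rolling sum is greater than n".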
I don't know that it can be done very easily with SQL alone. With T-SQL or similar, you could write a loop to duplicate rows, or you can use the SQL to generate the instructions for doing the row duplication instead.
I don't know your probability model, but you could use an approach like this to achieve the latter. Given these table definitions:
RowSource
---------
RowID
UserRowProbability
------------------
UserId
RowId
FrequencyMultiplier
You could write a query like this (SQL Server specific):
SELECT TOP 100 rs.RowId, urp.FrequencyMultiplier
FROM RowSource rs
LEFT JOIN UserRowProbability urp ON rs.RowId = urp.RowId
ORDER BY ISNULL(urp.FrequencyMultiplier, 1) DESC, NEWID()
This would take care of selecting a random set of rows as well as how many should be repeated. Then, in your application logic, you could do the row duplication and shuffle the results.
Start with 3 tables: users, data and user-data. User-data records which rows should be preferred for each user.
Then create one view based on the data rows that are preferred by the user.
Create a second view that has the non-preferred data.
Create a third view which is a union of the first 2. The union should select more rows from the preferred data.
Then finally select random rows from the third view.