Why are my final sample sizes different in SPSS compared to the PROCESS macro?

I’m doing a moderation analysis in SPSS. I recoded all my variables and coded "inapplicable", "don't know", and "doesn't apply" as -99 so they would be treated as missing values and excluded from the analysis. When I run the data through the PROCESS macro, I get a sample size of 680. When I run it through SPSS, I get 768. I also centered the variables in both PROCESS and SPSS.
Not sure what I’m doing wrong and why I’m getting two different sample sizes.
Any suggestions?
I tried running a frequency analysis, but I'm not getting the same counts as PROCESS. Not sure if I'm doing something wrong there.


Calculating a production chain using a database (Factorio)

I'm playing the game Factorio, where you build a factory.
For the time being, I made a kind-of flowchart using LibreOffice Calc to calculate how many machines I need to produce a certain material.
Example image from the spreadsheet
Each block has a recipe saved (blue). The recipe includes what it produces and what it needs, in what amounts, and how much time a craft takes.
It takes the demand from the previous block (yellow) and, using the recipe, calculates how many machines (green) it needs to fulfill this demand.
Based on the number of machines, it calculates its own demands (orange).
Then the following blocks do the same, until it has reached the last block.
Doing this in a spreadsheet does work, but it is quite a tedious task.
I showed this to my dad, as I'm quite proud of what I made, and he said that maybe a database would be more suitable.
I definitely see its advantages. For example, I could easily summarize the final demands for raw resources, the total power consumption, etc.
So I got myself Microsoft Access, and I'm pretty lost now. I know the basics of databases and some SQL, but I'm not quite sure how I would build this.
My first attempt was:
one table for machines. It includes each machine's production speed and other relevant stats.
one table for recipes. Each recipe states what it produces, what it needs, the amount of each, and whether or not it is basic. Basic means it is a raw resource, i.e. the production chain ends there.
one table for units. Each unit has a machine, a recipe, and an amount. For example, I would have one unit using basic assemblers to produce iron gears. The unit also says how many machines it contains, so its inputs and outputs scale with that count.
I did manage to make a query that calculates the total inputs and outputs of all units based on their machine and recipe, as well as the total energy consumption.
However, that is nowhere near the spreadsheet I made.
For now we can probably set the graphical overlay aside; that would be overkill. However, what I do want to be able to do:
enter how much I want of a certain resource
based on that entry, the database would create a new table. The first entry would be the unit that produces the requested resource. The second would fulfill the first's demand, the third fulfills the second's demand, and so on.
So in the end I would end up with a list of units that will produce my requested resource.
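To make the calculation concrete, here is a rough Python sketch of the recursion I have in mind (the recipe data, crafting speed, and all names are made-up placeholders; in Access this would presumably become a query or some VBA):

# Sketch of the production-chain expansion the spreadsheet does.
# Placeholder recipes: item -> amount per craft, craft time in seconds,
# inputs consumed per craft, and whether it is a basic (raw) resource.
recipes = {
    "iron_gear":  {"amount": 1, "time": 0.5, "inputs": {"iron_plate": 2}, "basic": False},
    "iron_plate": {"amount": 1, "time": 3.2, "inputs": {"iron_ore": 1}, "basic": False},
    "iron_ore":   {"amount": 1, "time": 1.0, "inputs": {}, "basic": True},
}

CRAFT_SPEED = 0.5  # placeholder machine speed

def expand(item, demand_per_sec, plan):
    # Demand -> machines needed -> demands passed on to the previous blocks.
    recipe = recipes[item]
    crafts_per_sec = demand_per_sec / recipe["amount"]
    plan[item] = plan.get(item, 0) + crafts_per_sec * recipe["time"] / CRAFT_SPEED
    if not recipe["basic"]:
        for inp, qty in recipe["inputs"].items():
            expand(inp, crafts_per_sec * qty, plan)

plan = {}
expand("iron_gear", 2.0, plan)  # "I want 2 gears per second"
for item, machines in plan.items():
    print(f"{item}: {machines:.2f} machines")

Accumulating into plan also handles items that show up in several branches of the chain.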
I hope someone can help me. There are programs out there that already do this kind of stuff, but I want to do this myself. If this is a problem that a database isn't suited for, then please tell me so.
Thanks for any help!

I'm looking to create an automated numbering system for custom paint by number kits in Photoshop

So I know very little about programming all around. I'm adept at Photoshop and I'm looking to automate the numbering system for making these paint by number kits. I convert the images into vector format and set a maximum number of color variations. I then use Adobe Illustrator to create the outlined partitions of the image by color. This is all well and good; it's automated and efficient as far as I need.
My dilemma is that I do not have a system that can number these partitions in a clear and uniform fashion. I must do this tediously in photoshop, taking hours to finish.
I am looking to create or find a system that will do this last step automatically.
My vision for how this would look is numbers, 1-20 or so depending on the set color cap, evenly distributed across each partition in a uniform font and size. The idea is a grid of a single number (the reference to the color needed in that partition) spread across larger partitions, and only a few instances of the number on the smaller partitions. It would hopefully look like so:
You can see here how tedious this can become.
I don't know how to accomplish this, but I'm wondering how complicated this process would be in theory, and whether it would be better for me to learn how to do it myself, hire a professional, or continue the hand numbering. It's creating a labor cap on my small business that is preventing further growth.
Any and all help is very much appreciated; if I can provide more context or specifications I would be more than happy to do so. Thank you!
Just for fun I've managed to tweak old Johnware's script (Circle Fill). Now it can fill with given letters (numbers, for example). It works to a degree, but the result is far from ideal:
Probably it can be used as a start.
I believe a real programmer could make it way better.
My tweaked version of the script is here: https://disk.yandex.ru/d/Ze4-1DQoNRVF1g
Update
I've improved the script further. Now it:
works more precisely
handles several selected paths
remembers values in the dialog window
sets font size
Here is the updated version of the script: https://disk.yandex.ru/d/0pcpLDGrfQKMJA
It took me about 15 minutes to do this:
But I had to split some complex paths with the Knife tool. Sometimes the script throws a mystical error; I just selected another set of paths and ran the script again and again.
It is not a final result, but it's close. I think it's much faster than doing it manually.
It can be done with a script to some degree. It will work fine for simple shapes, but for complicated shapes it is hard to calculate where to put the numbers and how many will be enough.
But I have seen scripts that can fill any shape with arbitrary symbols, so technically it should be possible to fill any shape with numbers.
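The core idea is not Illustrator-specific: lay a grid of candidate positions over the shape and keep the points that pass a point-in-polygon test. A toy Python sketch (coordinates and spacing are invented):

# Toy version: place label positions on a grid and keep only those that
# fall inside the partition's polygon (standard ray-casting test).
def point_in_polygon(x, y, poly):
    inside = False
    j = len(poly) - 1
    for i in range(len(poly)):
        xi, yi = poly[i]
        xj, yj = poly[j]
        if (yi > y) != (yj > y) and x < (xj - xi) * (y - yi) / (yj - yi) + xi:
            inside = not inside
        j = i
    return inside

def label_positions(poly, spacing):
    xs = [p[0] for p in poly]
    ys = [p[1] for p in poly]
    points = []
    x = min(xs)
    while x <= max(xs):
        y = min(ys)
        while y <= max(ys):
            if point_in_polygon(x, y, poly):
                points.append((x, y))
            y += spacing
        x += spacing
    return points

# A rough L-shaped partition; each returned point would get a text frame
# containing the partition's color number.
shape = [(0, 0), (10, 0), (10, 4), (4, 4), (4, 10), (0, 10)]
print(label_positions(shape, 3))

A real script would also want a minimum distance from the outline, so numbers don't sit on top of the path.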
Of course, if you aren't a seasoned coder it makes no sense to try to do it at home. You need a pro (not even me).
And I see another very simple option as well:
It doesn't even need a script. What do you think?

How do I handle variability of output in AnyLogic?

I have been working on a simulation model for battery swapping in AnyLogic. So far I have developed the simulation model, an optimization experiment, and a parameter variation experiment.
There are no errors in the model, but the output values are unsatisfactory. Small changes, such as changing the step size of the decision variables, result in a drastic change in the best value obtained after every experiment. The objective itself does not change much, but I am concerned about the other variables that change with each run. Even with multiple optimization runs it is difficult to come to a conclusion.
For reference, I am posting the output of a parameter variation experiment here. I ran the experiment with an optimized value, but I was getting feasible results (percentile > 95%) far from the expected input values. The overall trend is correct (decreasing percentile with increasing charging time), but the variability is difficult to understand.
Can anyone help?
When building a model, this is a common problem you will have when looking at high-level overall outputs. You could have a model bug, but it is just as likely (if not more likely) that there is some dynamic in your system that was not clear in simple Excel spreadsheets or mental models. The DES may be telling us something truly interesting about the system behavior, but without additional outputs, there is no way to understand what that is.
A few suggestions:
Run this as a simple single scenario, where you manually update inputs. When you run this with the low range of input values and then the high range of input values, what do you see on the animation or additional outputs that is different than you expected or could explain the overall output trend? Try running several intermediate points.
Add additional output metrics. If you look at queue sizes, resource utilizations, turnaround times, etc., do you see anything at that level that differs from what you expected?
Add a "replication" log. When you run a set of inputs for multiple scenarios, does any single replication stand out as an outlier? If so, re-run the scenario with that set of inputs and that random seed.
There is no substitute for understanding underlying system behavior, and without understanding those dynamics, looking at overall correlation with optimization or parameter variation experiments will often lead companies to make the wrong policy decisions.

Run the same IPython notebook code on two different data files, and compare

Is there a good way to modularize and re-use code in IPython Notebook (Jupyter) when doing the same analysis on two different sets of data?
For example, I have a notebook with a lot of cells doing analysis on a data file. I have another data file of the same format, and I'd like to run the same analysis and compare the output. None of these options looks particularly appealing for this:
Copy and paste the cells to a second notebook. The analysis code is now duplicated and harder to update.
Move the analysis code into a module and run it for both files. This would lose the cell-by-cell format of the figures that are currently generated and simply jumble them all together in one massive cell.
Load both files in one notebook and run the analyses side by side. This also involves a lot of copy-and-pasting, and doesn't generalize well to 3 or 4 different data files.
Is there a better way to do this?
You could lace demo directives into the standalone module, as per the IPython Demo Mode example.
Then when actually executing it in the notebook, you make a call to the demo object wrapper each time you want to step to the next important part. So your cells would mostly consist of calls to that demo wrapper object.
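Roughly like this, if memory of the demo-mode docs serves (double-check the marker syntax against IPython.lib.demo before relying on it):

# analysis_demo.py -- the standalone module, split into blocks by the
# special stop-marker comments that demo mode looks for:
#
#     data = load("data_a.csv")      # names here are illustrative
#     # <demo> stop
#     plot_distribution(data)
#     # <demo> stop
#
# In the notebook, each cell then just advances the demo by one block:
from IPython.lib.demo import Demo

demo = Demo("analysis_demo.py")
demo()  # runs up to the first stop marker
demo()  # runs the next block, and so on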
Option 2 is clearly the best for code re-use; it is arguably the de facto standard in all of software engineering.
I would argue that the notebook concept itself doesn't scale well to 3, 4, 5, ... different data files. Notebook presentations are not meant to be batch-processing receptacles. If you find yourself needing to do parameter sweeps across different data sets, and wanting to re-run analyses on top of the different data loaded for each parameter group (even when the 'parameters' are as simple as different file names), that is a bad code smell: it likely means the level of analysis being performed in an 'interactive' way is wrong. Witnessing analysis 'interactively' and performing batch processing are two largely incompatible goals.

A much better idea is to batch process all of the parameter sets separately, 'offline' from the point of view of any presentation, and then build a set of stand-alone functions that can produce visual results from the computed and stored batch results. The notebook then becomes a series of function calls, each of which produces summary data across all of the parameter sets at once (some of it possibly drawn from a selection of parameter sets during batch processing), inviting the necessary comparisons and meaningfully presenting the result data side by side.
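As a minimal sketch of that shape (file, column, and function names are illustrative), the heavy lifting lives in a module and each notebook cell is one presentation-level call:

# analysis.py -- shared module; one function per step preserves the
# notebook's cell-by-cell figures instead of one massive cell.
import pandas as pd
import matplotlib.pyplot as plt

def load(path):
    return pd.read_csv(path)

def plot_distribution(df, column):
    df[column].hist()
    plt.title(f"{column} distribution")
    plt.show()

def summarize(df):
    return df.describe()

# In the notebook:
#   cell 1:  import analysis
#            runs = {name: analysis.load(path)
#                    for name, path in [("a", "data_a.csv"), ("b", "data_b.csv")]}
#   cell 2:  for name, df in runs.items(): analysis.plot_distribution(df, "value")
#   cell 3:  for name, df in runs.items(): display(analysis.summarize(df))

This generalizes to N files by adding entries to runs, not cells to the notebook.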
'Witnessing' an entire interactive presentation that performs analysis on one parameter set, then changing some global variable, switching to a new notebook, or running more cells in the same notebook in order to 'witness' the same presentation on a different parameter set sounds borderline useless to me. I cannot imagine a situation where that mode of consuming the presentation is not strictly worse than a targeted summary presentation that first computes results for all parameter sets of interest and assembles the important results into a comparison.
Perhaps the only case I can think of would be toy pedagogical demos, like some toy frequency data and a series of notebooks that do some simple Fourier analysis or something. But that's exactly the kind of case that begs for the analysis functions to be made into a helper module, and the notebook itself just lets you selectively declare which toy input file you want to run the notebook on top of.

Correctness testing for process modelling application

Our group is building a process modelling application that simulates an industrial process. The final output of this process is a set of numbers representing chemistry and flow rates.
This application is based on some very old software that uses the exact same underlying mathematical model to create the simulation. Thousands of variables are involved in the simulation.
Although each component has been unit tested, we now need to make sure that the data output produced by our software matches that of the old simulation software. I am wondering how best to approach this in a formalised and rigorous manner.
The old program takes its input from a text file, so I was thinking we could programmatically take each variable, adjust its value in the file (and correspondingly in our new application), then compare the outputs between the new and old applications. We would do this for every variable in the model.
We know the allowable range for each variable, so I suppose a random sample of a few values across each variable's range is enough to show correctness for that variable.
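In sketch form, the harness I'm imagining looks something like this (Python; the two simulators are stand-in functions here, and the ranges and tolerance are placeholders; in reality each run would write the text input file, invoke the executable, and parse its output):

# Sketch of the back-to-back loop: hold every variable at a baseline,
# vary one at a time, run both programs, and diff the outputs.
import random

TOLERANCE = 1e-9           # placeholder; may need per-variable tolerances
SAMPLES_PER_VARIABLE = 5   # placeholder

variable_ranges = {        # placeholder: the known allowable ranges
    "temperature": (300.0, 900.0),
    "flow_rate":   (0.1, 12.5),
}

def old_sim(inputs):
    # Stand-in for running the legacy program on its input file.
    return {"chemistry": inputs["temperature"] * 0.01,
            "flow": inputs["flow_rate"] * 2.0}

def new_sim(inputs):
    # Stand-in for running the new application on the same inputs.
    return {"chemistry": inputs["temperature"] * 0.01,
            "flow": inputs["flow_rate"] * 2.0}

baseline = {name: (lo + hi) / 2 for name, (lo, hi) in variable_ranges.items()}

for var, (lo, hi) in variable_ranges.items():
    for _ in range(SAMPLES_PER_VARIABLE):
        inputs = dict(baseline, **{var: random.uniform(lo, hi)})
        old, new = old_sim(inputs), new_sim(inputs)
        for key in old:
            if abs(old[key] - new[key]) > TOLERANCE:
                print(f"MISMATCH at {var}={inputs[var]:.4g}: "
                      f"{key}: {old[key]} vs {new[key]}")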
Any thoughts on this approach? Any other ideas?
Comparing the output of the old and new applications is definitely a good idea. This is sometimes called back-to-back testing.
Regarding test input samples, familiarize yourself with the following concepts:
Equivalence partitioning
Boundary-value analysis
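In code terms, that means not sampling purely at random but deliberately hitting each variable's partition boundaries as well. A small sketch (the range and epsilon are placeholders):

# Boundary-value analysis in miniature: for each variable, test the
# extremes, values just inside and outside them, and one representative
# mid-range value, instead of relying on random draws alone.
def boundary_values(low, high, eps):
    return [
        low - eps,         # just outside: should be rejected or flagged
        low,               # lower boundary
        low + eps,         # just inside
        (low + high) / 2,  # representative of the valid partition
        high - eps,        # just inside
        high,              # upper boundary
        high + eps,        # just outside
    ]

print(boundary_values(0.0, 100.0, 0.001))  # placeholder range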