How to add one more measure of interest in the arules package

I would like to add two additional measures to the output of the "inspect" function in the arules package: Kulczynski and imbalance ratio.
Could you tell me where to find the code of the inspect function and how to modify it?
Thanks

All you need to do is add additional columns to the quality data.frame; inspect will pick those up automatically. Here is the example from ?interestMeasure:
data("Income")
rules <- apriori(Income)
## calculate a single measure and add it to the quality slot
quality(rules) <- cbind(quality(rules),
  hyperConfidence = interestMeasure(rules, method = "hyperConfidence",
    transactions = Income))
inspect(head(sort(rules, by = "hyperConfidence")))
  lhs                                 rhs                             support   confidence lift     hyperConfidence
1 {ethnic classification=hispanic} => {education=no college graduate} 0.1096568 0.8636884  1.224731 1
2 {dual incomes=no}                => {marital status=married}        0.1400524 0.9441176  2.447871 1
3 {occupation=student}             => {marital status=single}         0.1449971 0.8838652  2.160490 1
4 {occupation=student}             => {age=14-34}                     0.1592496 0.9707447  1.658345 1
5 {occupation=student}             => {dual incomes=not married}      0.1535777 0.9361702  1.564683 1
6 {occupation=student}             => {income=$0-$40,000}             0.1381617 0.8421986  1.353027 1

Imbalance is quite straightforward:
library(arules)
data("Income")
rules <- apriori(Income)
suppA <- support(lhs(rules), trans = Income)
suppB <- support(rhs(rules), trans = Income)
suppAB <- quality(rules)$support
quality(rules)$imbalance <- abs(suppA - suppB)/(suppA + suppB - suppAB)
inspect(head(rules))
  lhs                                 rhs                             support   confidence lift     imbalance
1 {}                               => {language in home=english}      0.9128854 0.9128854  1.000000 0.03082862
2 {occupation=clerical/service}    => {language in home=english}      0.1127109 0.9292566  1.017933 0.69021050
3 {ethnic classification=hispanic} => {education=no college graduate} 0.1096568 0.8636884  1.224731 0.61395923
4 {dual incomes=no}                => {marital status=married}        0.1400524 0.9441176  2.447871 0.35210356
5 {dual incomes=no}                => {language in home=english}      0.1364165 0.9196078  1.007364 0.63837280
6 {occupation=student}             => {marital status=single}         0.1449971 0.8838652  2.160490 0.34123127
The Kulczynski measure, 1/2 (P(A|B) + P(B|A)), is a little trickier. P(B|A) is just the confidence of A -> B, which is already in the quality slot. For P(A|B), however, we need the confidence of B -> A. So we create a new set of rules with the left-hand side and the right-hand side switched and calculate the confidence for those:
confAB <- quality(rules)$confidence
BArules <- new("rules", lhs = rhs(rules), rhs = lhs(rules))
confBA <- interestMeasure(BArules, method = "confidence", trans = Income)
quality(rules)$kulczynski <- .5*(confAB + confBA)
inspect(head(rules))
  lhs                                 rhs                             support   confidence lift     imbalance  kulczynski
1 {}                               => {language in home=english}      0.9128854 0.9128854  1.000000 0.03082862 0.9564427
2 {occupation=clerical/service}    => {language in home=english}      0.1127109 0.9292566  1.017933 0.69021050 0.5263616
3 {ethnic classification=hispanic} => {education=no college graduate} 0.1096568 0.8636884  1.224731 0.61395923 0.5095922
4 {dual incomes=no}                => {marital status=married}        0.1400524 0.9441176  2.447871 0.35210356 0.6536199
5 {dual incomes=no}                => {language in home=english}      0.1364165 0.9196078  1.007364 0.63837280 0.5345211
6 {occupation=student}             => {marital status=single}         0.1449971 0.8838652  2.160490 0.34123127 0.6191456
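As a quick sanity check of the two formulas, independent of arules, here is a minimal sketch in plain Python. The function names and the support values are made up for illustration; they are not taken from the Income data.

```python
def imbalance_ratio(supp_a, supp_b, supp_ab):
    """|supp(A) - supp(B)| / (supp(A) + supp(B) - supp(A and B))."""
    return abs(supp_a - supp_b) / (supp_a + supp_b - supp_ab)


def kulczynski(supp_a, supp_b, supp_ab):
    """Average of the two confidences P(B|A) and P(A|B)."""
    conf_ab = supp_ab / supp_a  # confidence of A -> B, i.e. P(B|A)
    conf_ba = supp_ab / supp_b  # confidence of B -> A, i.e. P(A|B)
    return 0.5 * (conf_ab + conf_ba)


# Hypothetical supports, chosen only to make the arithmetic easy to follow.
supp_a, supp_b, supp_ab = 0.4, 0.5, 0.2
ir = imbalance_ratio(supp_a, supp_b, supp_ab)
kulc = kulczynski(supp_a, supp_b, supp_ab)
print(round(ir, 4), round(kulc, 4))
```

Both measures are symmetric in A and B, so swapping the two sides of a rule leaves them unchanged.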

How to fetch data from a table after doing a full outer join?

I have a business problem. I have a table with the following data:
sku_id | solr_status | entry_context | mrp
1      | active      | 10            | 20
1      | inactive    | 10            | 30
1      | active      | 10            | 22.5
2      | inactive    | 10            | 10
2      | inactive    | 10            | 31
I filter the data into two tables:
table1 -> active
table2 -> inactive
Now I have to do a full outer join of 1 and 2 on sku_id. There are three possible cases:
case (1 is not null, 2 is not null) => use 1
case (1 is null, 2 is not null) => use 2
case (1 is not null, 2 is null) => use 1
My code is something like this:
val activeCatalog=dataframe.filter('solr_status===true).cache()
val inActiveCatalog=dataframe.filter('solr_status===false).cache()
val fullCatalog=activeCatalog.join(inActiveCatalog,Seq("sku"),joinType = "outer")
fullCatalog.show(false)
val resultCatalog = (activeCatalog, inActiveCatalog) match {
  case (activeCatalog, inActiveCatalog) if (activeCatalog.count() != 0L && inActiveCatalog.count() != 0L) =>
    fullCatalog.filter('solr_status === true).cache()
  case (activeCatalog, inActiveCatalog) if (activeCatalog.count() == 0L && inActiveCatalog.count() != 0L) =>
    fullCatalog.filter('solr_status === false).cache()
  case (activeCatalog, inActiveCatalog) if (activeCatalog.count() != 0L && inActiveCatalog.count() == 0L) =>
    fullCatalog.filter('solr_status === true).cache()
}
With this approach I get an ambiguous column error. Also, my result dataset
should keep the schema of the active or inactive table, but doing an outer
join creates duplicate columns.
Any help?
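The three cases boil down to one rule per sku_id: use the active rows when there are any, otherwise fall back to the inactive rows. Below is a minimal sketch of that rule in plain Python, not Spark. The helper name prefer_active is hypothetical and the rows are the sample data from the question.

```python
from collections import defaultdict

# Rows mirror the sample table: (sku_id, solr_status, entry_context, mrp).
rows = [
    (1, "active", 10, 20.0),
    (1, "inactive", 10, 30.0),
    (1, "active", 10, 22.5),
    (2, "inactive", 10, 10.0),
    (2, "inactive", 10, 31.0),
]


def prefer_active(rows):
    """For each sku_id keep the active rows if any exist, else the inactive ones."""
    by_sku = defaultdict(lambda: {"active": [], "inactive": []})
    for row in rows:
        by_sku[row[0]][row[1]].append(row)
    result = []
    for sku_id in sorted(by_sku):
        group = by_sku[sku_id]
        # An empty list is falsy, so this falls back to the inactive rows.
        result.extend(group["active"] or group["inactive"])
    return result


result = prefer_active(rows)
for row in result:
    print(row)
```

Because every output row comes from exactly one of the two sides, the original schema is kept and no duplicated columns appear.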

CakePHP 3: sum two variables in view.ctp

I have a view that shows an associated array of Revenues. I have made a collection in the controller to isolate two variables that I need to add together and display as currency.
public function view($id = null)
{
$annualOperatingBudget = $this->AnnualOperatingBudgets->get($id, [
'contain' => ['Azinstitutions', 'BudgetExpenses', 'BudgetExpenses.ExpenseTitles', 'BudgetRevenues', 'BudgetRevenues.RevenueTitles']
]);
$collection = new Collection($annualOperatingBudget->budget_revenues);
$revenuesGroup1 = $collection->match(['revenue_title.revenue_group' => 1 ]);
$revenuesGroup2 = $collection->match(['revenue_title.revenue_group' => 2 ]);
$tuitionAndFees = $collection->match(['revenue_title.revenue_title' => 'Tuition and Fees']);
$lessScholarshipAllowance = $collection->match(['revenue_title.revenue_title' => '- less Scholarship Allowance']);
$this->set(compact('annualOperatingBudget', 'revenuesGroup1', 'revenuesGroup2', 'tuitionAndFees', 'lessScholarshipAllowance'));
}
I am able to see the variables with the debug kit:
annualOperatingBudget (array)
revenuesGroup1 (array)
revenuesGroup2 (array)
tuitionAndFees (array)
4 (App\Model\Entity\BudgetRevenue)
id 5
annual_operating_budget_id 1
revenue 1278
revenue_title_id 5
revenue_title (array)
lessScholarshipAllowance (array)
5 (App\Model\Entity\BudgetRevenue)
id 6
annual_operating_budget_id 1
revenue -257
revenue_title_id 6
revenue_title (array)
I would like to add the two 'revenue' values together.
I tried:
<?= $this->Number->currency(
($tuitionAndFees->revenue) + ($lessScholarShipAllowance->revenue),
'USD', ['places' => 1])
?>
But I get several errors:
Notice (8): Undefined property: Cake\Collection\Iterator\FilterIterator::$revenue [ROOT\plugins\Twit\src\Template\AnnualOperatingBudgets\view.ctp, line 49]
Notice (8): Undefined variable: lessScholarShipAllowance [ROOT\plugins\Twit\src\Template\AnnualOperatingBudgets\view.ctp, line 49]
Notice (8): Trying to get property of non-object [ROOT\plugins\Twit\src\Template\AnnualOperatingBudgets\view.ctp, line 49]
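$tuitionAndFees and $lessScholarshipAllowance are collections (the result of match()), not single entities, so you have to iterate them before reading the revenue property. Note also that the controller sets $lessScholarshipAllowance while the view uses $lessScholarShipAllowance; PHP variable names are case-sensitive, which is why you get the "Undefined variable" notice. Something like this:
foreach ($tuitionAndFees as $tuitionAndFee) {
    echo $tuitionAndFee->revenue;
}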
If all you need in the view is the total of all tuition and fees, you can use
$tuitionAndFees = $collection
->match(['revenue_title.revenue_title' => 'Tuition and Fees'])
->sumOf('revenue');
This returns just the sum of the matched items. Do the same for $lessScholarshipAllowance (spelled as in the controller), and then in your view simply:
<?= $this->Number->currency($tuitionAndFees + $lessScholarshipAllowance,
    'USD', ['places' => 1]) ?>
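The filter-then-sum idea can be checked by hand with the two revenues from the debug output. This is a minimal sketch in plain Python rather than CakePHP; the dictionaries are hypothetical stand-ins for the BudgetRevenue entities and sum_revenue is an illustrative helper, not a framework function.

```python
# Hypothetical stand-ins for the BudgetRevenue entities shown in the debug output.
budget_revenues = [
    {"id": 5, "revenue": 1278,
     "revenue_title": {"revenue_title": "Tuition and Fees"}},
    {"id": 6, "revenue": -257,
     "revenue_title": {"revenue_title": "- less Scholarship Allowance"}},
]


def sum_revenue(revenues, title):
    """Sum 'revenue' over the records whose nested title matches."""
    return sum(
        r["revenue"] for r in revenues
        if r["revenue_title"]["revenue_title"] == title
    )


tuition_and_fees = sum_revenue(budget_revenues, "Tuition and Fees")
less_allowance = sum_revenue(budget_revenues, "- less Scholarship Allowance")
net = tuition_and_fees + less_allowance
print(net)
```

Reducing each match to a plain number first means the final addition works on two scalars instead of two iterators.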

Convert SQL Server query to Entity Framework query

I have a SQL Server query like this:
select
month(fact_date) as month,
sum(case when beef_dairy_stat = 1 and param_id = 1 then 1 else 0 end) as cnt
from
user_behave_fact
where
YEAR(fact_date) = 2018
group by
month(fact_date)
order by
month
with a result of
month cnt
------------
1 10
2 20
Now I need to convert this query to its corresponding Entity Framework query.
This is my current attempt:
var sql_rez_ICC = new List<Tuple<int, int>>();
sql_rez_ICC = db.user_behave_fact
.Where(x => x.fact_date.Value.Year == selected_year)
.GroupBy(y => y.fact_date.Value.Month)
.Select(y =>new { month = y.Select(x=>x.fact_date.Value.Month), icc_count = y.Count(x => x.beef_dairy_stat == true && x.param_id == 1) })
.AsEnumerable()
.Select(y => new Tuple<int, int>(y.month, y.icc_count))
.ToList();
However, in the second .Select I get an error on month:
Cannot convert from System.Collections.Generic.IEnumerable<int> to int
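Inside the projection, y is a group of rows, so y.Select(x=>x.fact_date.Value.Month) returns an IEnumerable<int> rather than a single int. The grouping key is already the month, so use month = y.Key instead.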
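To illustrate why the key is the right thing to project, here is a minimal sketch of the same group-and-count logic in plain Python. The fact rows are invented for the example and are not from the real table.

```python
from collections import Counter
from datetime import date

# Hypothetical fact rows: (fact_date, beef_dairy_stat, param_id).
facts = [
    (date(2018, 1, 5), True, 1),
    (date(2018, 1, 9), False, 1),
    (date(2018, 2, 3), True, 1),
    (date(2018, 2, 7), True, 1),
    (date(2017, 2, 7), True, 1),  # filtered out: wrong year
]

selected_year = 2018
counts = Counter()
for fact_date, beef_dairy_stat, param_id in facts:
    if fact_date.year != selected_year:
        continue
    # The month is the grouping key; every month seen gets an entry,
    # even when no row in it satisfies the condition.
    counts[fact_date.month] += int(beef_dairy_stat and param_id == 1)

result = sorted(counts.items())
print(result)
```

Each entry pairs one key (the month) with one count, which is the shape a Tuple<int, int> expects.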

Symfony/Doctrine: SUM and AVG score of players

I have two tables in my database: PLAYERS and SCORES.
The SCORES table has these columns: ID - IDPLAYER - SCORE
For example:
ID IDPLAYER SCORE
---------------------
1 1 5
2 2 4
3 1 3
4 2 1
5 1 9
I want to show this in a template:
For "player 1" there are 3 scores.
The sum of the scores is "17" (9+3+5).
The average score of the player is "5.67" (17 total / 3 scores).
I have an entity mapped with the ORM, and that works.
I have a controller with this function:
public function avgScoreAction($id) {
$queryScore = $this->getDoctrine()
->getRepository('AcmeBundle:tabScores');
$queryAvgScore = $queryScore->createQueryBuilder('g')
->select("avg(g.score)")
->where('g.idPlayer = :idPlayer')
->setParameter('idPlayer', $id)
->getQuery();
$avgScore = $queryAvgScore->getResult();
$result = ("Score average: ".$avgScore);
return new Response($result);
}
But I get an error:
"Notice: Array to string conversion" on this line:
$result = ("Score average: ".$avgScore);
If I write this:
$response = new Response();
$response->setContent(json_encode(array($avgScore)));
$response->headers->set('Content-Type', 'application/json');
return $response;
I get this:
[[{"1":"5.6667"}]]
which is the correct average, but what are [[{"1":" and "}]]?
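Regarding [[{"1":" and "}]]: 1 is the index of avg(g.score) in your query, because the expression has no alias. getResult() returns an array of rows, and json_encode(array($avgScore)) wraps that in one more array, which explains the two opening brackets. To better understand why, try an echo of $queryAvgScore->getDQL() before getResult().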
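The nesting is easier to see when it is rebuilt by hand. This is a minimal sketch in plain Python that only models the shape of the data; the string key "1" stands in for the integer index that PHP uses, which is an assumption of the model and not Doctrine code.

```python
import json

# What getResult() hands back for one unaliased scalar: a list of rows,
# each row keyed by the 1-based position of the selected expression.
avg_score = [{"1": "5.6667"}]

# json_encode(array($avgScore)) wraps that list in one more array.
encoded = json.dumps([avg_score], separators=(",", ":"))
print(encoded)

# Unwrapping: first row, then the column keyed "1".
value = float(avg_score[0]["1"])
print(value)
```

So the outer brackets come from the extra array(), the inner brackets from the list of rows, and the braces from the single row.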
Let's get back to the general question.
The SQL is:
SELECT AVG(SCORE) as AVG, COUNT(SCORE) as COUNT, IDPLAYER as PLAYER FROM SCORES GROUP BY IDPLAYER
And now with the query builder:
$queryAvgScore = $queryScore->createQueryBuilder('g')
->select("avg(g.score) as score_avg, count(g.score) as score_count")
->where('g.idPlayer = :idPlayer')
->groupBy('g.idPlayer')
->setParameter('idPlayer', $id)
->getQuery();
Notice that I have added aliases; this is better than relying on indexes.
Hope it helps.
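The per-player aggregates are easy to check by hand. Here is a minimal sketch in plain Python (not Doctrine) that groups the SCORES rows from the question by player and computes count, sum, and average. The key names score_count and score_avg follow the aliases above, while score_sum is an added, hypothetical one.

```python
from collections import defaultdict

# The SCORES rows from the question: (id, id_player, score).
scores = [(1, 1, 5), (2, 2, 4), (3, 1, 3), (4, 2, 1), (5, 1, 9)]

# Group the scores by player, like GROUP BY IDPLAYER.
by_player = defaultdict(list)
for _id, id_player, score in scores:
    by_player[id_player].append(score)

stats = {
    player: {
        "score_count": len(values),
        "score_sum": sum(values),
        "score_avg": sum(values) / len(values),
    }
    for player, values in by_player.items()
}
print(stats[1])
```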
In Symfony 2.6 this is easy with DQL:
$dql = "SELECT SUM(e.amount) AS balance FROM Bank\Entities\Entry e " .
"WHERE e.account = ?1";
$balance = $em->createQuery($dql)
->setParameter(1, $myAccountId)
->getSingleScalarResult();
Info:
http://doctrine-orm.readthedocs.org/en/latest/cookbook/aggregate-fields.html?highlight=sum

Entity Framework: distinct by field

I have a table kopija that looks like this:
idKopija | idFilm | nije_tu
1        | 1      | 0
2        | 1      | 0
3        | 1      | 1
4        | 2      | 1
and so on.
And I have a query that looks like this:
var upit = from f in baza.films
join z in baza.zanrs on f.idZanr equals z.idZanr
join k in baza.kopijas on f.idFilm equals k.idFilm
select new
{
idFilm = f.idFilm,
nazivFilm = f.naziv,
nazivZanr = z.naziv,
idZanr = f.idZanr,
godina = f.godina,
slika = f.slika,
klip = f.klip,
nijeTu = k.nije_tu
};
if (checkBox1.Checked)
upit = upit.Where(k => k.nijeTu == 0).Distinct();
else
{
upit = upit.Where(k => k.nijeTu == 0 || k.nijeTu == 1).Distinct();
}
Now I want to make a distinct list of "idFilm". The problem is that I get the same idFilm twice, because one row has nije_tu=0 and the other has nije_tu=1, so the two rows are not identical.
Can someone please help me?
Thank you.
What about
upit.Where(k => k.nijeTu == 0 || k.nijeTu == 1).Select(x => x.idFilm).Distinct();
?
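The difference between distinct rows and distinct keys can be seen in a minimal plain Python sketch. The rows are hypothetical and only follow the shape of the kopija sample; the film names are invented.

```python
# Hypothetical projected rows: (idFilm, nazivFilm, nijeTu).
rows = [
    (1, "Film A", 0),
    (1, "Film A", 0),
    (1, "Film A", 1),
    (2, "Film B", 1),
]

# Distinct over whole rows: film 1 survives twice because nijeTu differs.
distinct_rows = sorted(set(rows))

# Distinct over the projected key only: one entry per film.
distinct_ids = sorted({id_film for id_film, _naziv, _nije_tu in rows})

print(distinct_rows)
print(distinct_ids)
```

Projecting to the key before removing duplicates is what makes the result unique per film, at the cost of dropping the other columns.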