Proguard keep public classes, fields and methods - kotlin

I use Kotlin and I have many internal classes.
I want to obfuscate and shrink everything apart from all public classes.
Proguard rules:
-dontusemixedcaseclassnames
-dontskipnonpubliclibraryclasses
-verbose
-optimizationpasses 5
-overloadaggressively
-repackageclasses ''
-allowaccessmodification
-keep public class * {
public <methods>;
public <fields>;
}
Unfortunately the -keep public class * behaves defensively and keeps all names, also for the internal classes.

Your rules are way too broad.
A single "-keep" with nested member rules is broader than a combination of "-keepclassmembers" and "-keepclasseswithmembers" rules.
A full "-keep" rule means "do not remove, rename or change the bytecode of that class or member, ever".
Classes, referenced by kept classes and methods, can not be removed, renamed or repackaged
This line in your rules keeps all your classes and interfaces:
-keep public class * {
I mean ALL of them. Whether they have public members or not.
Use -keepclasseswithmembers instead!
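To illustrate how the conditional form differs, here is a small sketch based on the well-known "keep all applications" pattern. It is not a replacement for the rules in the question, only a demonstration that the member list acts as a condition: a class is kept only if it actually declares the listed member.

```
# Keep a public class (and its main method) only if it declares
# a public static void main(String[]) method.
# Classes without such a method are not matched by this rule.
-keepclasseswithmembers public class * {
    public static void main(java.lang.String[]);
}
```

A plain -keep with the same body would instead keep every public class, whether or not it has a main method.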
Because of these lines
{
public <methods>;
public <fields>;
}
all your public methods will be left untouched, which means that methods referenced from your public methods cannot be repackaged or renamed!
If you want at least some repackaging to be done, make sure to allow optimization (because repackaging is performed as part of optimization step):
-keepclassmembers,allowoptimization public class * {
public <methods>;
public <fields>;
}
In addition to repackaging, this will also allow for some inlining (which in turn assists in removing classes, that supply inlined methods).
Also, with Android apps you are much better off repackaging into your primary package (the application package, or the package with the largest number of your immovable classes in it) instead of the empty package (''). This is because some "exported" classes (Activities, Views, Services and other things referenced from xml files) cannot be moved outside of their package by Proguard: aapt dynamically generates special rules to prevent that. The part of the optimization process that changes access modes from public to protected/private becomes more efficient the more classes can be placed together in a single package.
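As a sketch of that advice, the repackaging target can be named explicitly. The package name com.example.app below is a placeholder for your own application package, not something taken from the question.

```
# Move repackageable classes into the (hypothetical) application package
# instead of the empty root package.
-repackageclasses com.example.app

# Let ProGuard adjust access modifiers where classes end up in the same package.
-allowaccessmodification
```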
I want to obfuscate and shrink everything apart from all public classes.
Bad idea. You really should try to obfuscate as much as possible, especially public classes. If you restrict obfuscation, repackaging is also restricted, because moving a class to another package means renaming it.
Aim for the most specific rules possible.
If you want to prevent shrinking:
-keep,allowoptimization,allowobfuscation public class com.example.Example
If you want to prevent renaming, but allow stripping unused classes:
-keep,allowoptimization,allowshrinking public class com.example.*
In general, avoid wildcard rules (bare *) and -keep rules: prefer rules for specific classes and -keepclassmembers/-keepclasseswithmembers.
The correct approaches for obfuscating applications and libraries are completely different, but they have something in common — you should not care about public methods/classes; just obfuscate/shrink/repackage as much as possible until any more would break it.
For applications you should just obfuscate/repackage as much as possible. If you don't know, which packages are safe to obfuscate, start from opting known safe packages into obfuscation.
For libraries: do not apply Proguard to the library itself (unless you are trying to achieve security by obscurity). Use the consumer Proguard files feature of the aar format, which lets you supply rule "segments" that are followed during the final app obfuscation.
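A consumer rules file is just a regular ProGuard rules file that travels with the library. In an Android library module it is typically registered through the consumerProguardFiles setting in the module's Gradle build. The file name and the class name below are hypothetical; the sketch assumes the library instantiates one class reflectively and therefore needs it to survive the app's shrinking.

```
# consumer-rules.pro (hypothetical file name), packaged inside the aar and
# applied when the consuming app runs its own shrinking and obfuscation.

# Hypothetical class that the library creates via reflection at runtime:
# keep the class and its constructors, but still allow optimization.
-keep,allowoptimization class com.example.lib.internal.ReflectivelyLoaded {
    <init>(...);
}
```

Keeping such rules as narrow as possible matters here, because every consumer rule restricts what the final app build is allowed to rename or remove.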

Related

Proguard: -keepparameternames for certain packages only

The -keep option lets you exclude classes from obfuscation, but the method parameter names are still obfuscated, which can be bad for frameworks like Spring Web.
-keep class com.example.web.** { *; }
Is there a way to preserve the arguments names for certain packages only?
Not possible:
https://sourceforge.net/p/proguard/discussion/182455/thread/59cb6762/
~~~~~~~~~~~~~~~~
From what I've tried, -keepparameternames seems to affect only the methods that are -keep-ed.
So the answer to your question: it is possible to limit the packages that fall under its action, by marking only certain packages with -keep (or its derivatives).
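If that observation holds (it matches the manual's description of -keepparameternames as applying to "methods that are kept"), a configuration along these lines might be enough. The package is the one from the question; everything outside it is assumed not to be covered by any other -keep rule.

```
# Preserve parameter names, but only for methods that are kept.
-keepparameternames

# Only this package is kept, so only its methods retain their parameter names.
-keep class com.example.web.** { *; }
```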

With ProGuard, how do I obfuscate just one class?

What would be a smart ProGuard configuration to obfuscate just the private methods and constants of one particular class com.acme.Algorithm?
I would like to obfuscate just that, because it contains an algorithm that should not be plain obvious when accidentally opening the .jar.
I'm a ProGuard newbie. AFAIU, you have to use "keep", but the positive logic of "do obfuscate" is not available, right? So how do I exclude my class from a "keep everything" config? Note: I don't want to obfuscate other classes for the moment, because I want to allow the customer to see meaningful stacktraces.
Obfuscating a single class won't have much effect: it may change the class name and a few field names and methods names, and it may optimize some code. Obfuscation tends to be less effective for hiding small pieces of information. The more application code you obfuscate, the more difficult it becomes to understand.
That being said, you can specify:
-keep class !com.acme.Algorithm { *; }
It keeps all classes/fields/methods outside of com.acme.Algorithm.
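Since the question asks only for the private methods and constants of that class to be obfuscated, a possible refinement is to add a second rule that keeps the class name and its non-private members. This is an untested sketch, not part of the original answer.

```
# Keep everything outside com.acme.Algorithm as it is.
-keep class !com.acme.Algorithm { *; }

# Keep the class name and its public/protected API;
# private and package-private members remain free to be renamed.
-keep class com.acme.Algorithm {
    public protected *;
}
```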

How do I tell proguard to not do anything except some stripping

I'm using ProGuard to get rid of some logging:
-assumenosideeffects class android.util.Log {
public static int d(...);
public static int v(...);
}
I don't want anything else to happen to any classes. In particular I don't want any obfuscation, since this is a library. The clients of the library will apply obfuscation themselves.
Is there a way to tell proguard to do "nothing" except the -assumenosideeffects rule please?
This option is applied in the optimization step, so you could disable shrinking and obfuscation. You still need to provide -keep options, e.g. ProGuard manual > Examples > A typical library.
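Put together, such a configuration might look roughly like this. The -keep rule follows the shape of the "typical library" example mentioned above; treat the whole thing as a starting point rather than a verified setup.

```
# Leave shrinking and obfuscation to the clients of the library.
-dontshrink
-dontobfuscate

# Applied during optimization: calls to these methods may be removed.
-assumenosideeffects class android.util.Log {
    public static int d(...);
    public static int v(...);
}

# Entry points of the library, so the optimizer leaves the public API alone.
-keep public class * {
    public protected *;
}
```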

Exclude some classes from Proguard's keep rules

I have a library that is about to be obfuscated using ProGuard. "Library mode" is almost applicable for my use case, i.e. it is almost fine to keep all public and protected classes and class members.
However, due to Java's visibility requirements, some members cannot be made package-private or private, and so some classes are public although they should not be part of the library's API. I would like to have them obfuscated to make it clearer that these classes do not belong to the public API, as well as to get better obfuscation and smaller library jars.
Is there a way to exclude some items from a Proguard "keep" rule without specifying each of these items by name (using the '!')?
Ideally I would like to annotate these classes and members with a tagging annotation, but as far as I understand Proguard can only be told to keep items with certain annotations.
You can only keep items indeed. If you want to exclude certain class members, you have to do so by listing or annotating the class members that you do want to keep. When specifying a class name, you can provide a list, optionally with "!" to exclude names. When specifying a class member name and type, that is not possible. Still, in both cases, you can use wildcards. If you pick special names for your internal classes, this might work:
-keep public class * {
public protected *** !myInternalField*;
public protected *** !myInternalMethod*(...);
}
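The annotation route mentioned above works in the positive direction: you tag what belongs to the public API and keep only that. In this sketch, com.example.annotations.PublicApi is a hypothetical marker annotation; it would need at least class-file retention to be visible to ProGuard.

```
# Keep classes that carry the (hypothetical) marker annotation.
-keep @com.example.annotations.PublicApi public class *

# Keep individual fields and methods that carry the marker annotation.
-keepclassmembers class * {
    @com.example.annotations.PublicApi *;
}
```

Anything public but untagged is then free to be renamed or removed, which is the effect the question asks for.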

In what namespace should you put interfaces relative to their implementors?

Specifically, when you create an interface/implementor pair, and there is no overriding organizational concern (such as the interface having to go in a different assembly, i.e. as recommended by the s# architecture), do you have a default way of organizing them in your namespace/naming scheme?
This is obviously a more opinion based question but I think some people have thought about this more and we can all benefit from their conclusions.
The answer depends on your intentions.
If you intend the consumer of your namespaces to use the interfaces over the concrete implementations, I would recommend having your interfaces in the top-level namespace with the implementations in a child namespace
If the consumer is to use both, have them in the same namespace.
If the interface is for predominantly specialized use, like creating new implementations, consider having them in a child namespace such as Design or ComponentModel.
I'm sure there are other options as well, but as with most namespace issues, it comes down to the use-cases of the project, and the classes and interfaces it contains.
I usually keep the interface in the same namespace as the concrete types.
But, that's just my opinion, and namespace layout is highly subjective.
Animals
|
| - IAnimal
| - Dog
| - Cat
Plants
|
| - IPlant
| - Cactus
You don't really gain anything by moving one or two types out of the main namespace, but you do add the requirement for one extra using statement.
What I generally do is to create an Interfaces namespace at a high level in my hierarchy and put all interfaces in there (I do not bother to nest other namespaces in there as I would then end up with many namespaces containing only one interface).
Interfaces
|--IAnimal
|--IVegetable
|--IMineral
MineralImplementor
Organisms
|--AnimalImplementor
|--VegetableImplementor
This is just the way that I have done it in the past and I have not had many problems with it, though admittedly it might be confusing to others sitting down with my projects. I am very curious to see what other people do.
I prefer to keep my interfaces and implementation classes in the same namespace. When possible, I give the implementation classes internal visibility and provide a factory (usually in the form of a static factory method that delegates to a worker class, with an internal method that allows unit tests in a friend assembly to substitute a different worker that produces stubs). Of course, if the concrete class needs to be public (for instance, if it's an abstract base class), then that's fine; I don't see any reason to put an ABC in its own namespace.
On a side note, I strongly dislike the .NET convention of prefacing interface names with the letter 'I.' The thing the (I)Foo interface models is not an ifoo, it's simply a foo. So why can't I just call it Foo? I then name the implementation classes specifically, for example, AbstractFoo, MemoryOptimizedFoo, SimpleFoo, StubFoo etc.
(.Net) I tend to keep interfaces in a separate "common" assembly so I can use that interface in several applications and, more often, in the server components of my apps.
Regarding namespaces, I keep them in BusinessCommon.Interfaces.
I do this to ensure that neither I nor my developers are tempted to reference the implementations directly.
Separate the interfaces in some way (projects in Eclipse, etc) so that it's easy to deploy only the interfaces. This allows you to provide your external API without providing implementations. This allows dependent projects to build with a bare minimum of externals. Obviously this applies more to larger projects, but the concept is good in all cases.
I usually separate them into two separate assemblies. One of the usual reasons for an interface is to have a series of objects look the same to some subsystem of your software. For example, I have all my Reports implementing the IReport interface. IReport is used not only in printing but also for previewing and selecting individual options for each report. Finally, I have a collection of IReport to use in the dialog where the user selects which reports (and configuration options) they want to print.
The Reports reside in a separate assembly and the IReport, the Preview engine, print engine, report selections reside in their respective core assembly and/or UI assembly.
If you use the Factory Class to return a list of available reports in the report assembly then updating the software with new report becomes merely a matter of copying the new report assembly over the original. You can even use the Reflection API to just scan the list of assemblies for any Report Factories and build your list of Reports that way.
You can apply these techniques to Files as well. My own software runs a metal cutting machine, so we use this idea for the shape and fitting libraries we sell alongside our software.
Again the classes implementing a core interface should reside in a separate assembly so you can update that separately from the rest of the software.
I give my own experience that is against other answers.
I tend to put all my interfaces in the package they belong to. This guarantees that, if I move a package to another project, I have everything needed to run the package without any changes.
For me, any helper functions and operator functions that are part of the functionality of a class should go into the same namespace as that of the class, because they form part of the public API of that namespace.
If you have common implementations that share the same interface in different packages you probably need to refactor your project.
Sometimes I see that there are plenty of interfaces in a project that could be converted into an abstract implementation rather than an interface.
So, ask yourself if you are really modeling a type or a structure.
A good example might be looking at what Microsoft does.
Assembly: System.Runtime.dll
System.Collections.Generic.IEnumerable<T>
Where are the concrete types?
Assembly: System.Collections.dll
System.Collections.Generic.List<T>
System.Collections.Generic.Queue<T>
System.Collections.Generic.Stack<T>
// etc
Assembly: EntityFramework.dll
System.Data.Entity.IDbSet<T>
Concrete Type?
Assembly: EntityFramework.dll
System.Data.Entity.DbSet<T>
Further examples
Microsoft.Extensions.Logging.ILogger<T>
- Microsoft.Extensions.Logging.Logger<T>
Microsoft.Extensions.Options.IOptions<T>
- Microsoft.Extensions.Options.OptionsManager<T>
- Microsoft.Extensions.Options.OptionsWrapper<T>
- Microsoft.Extensions.Caching.Memory.MemoryCacheOptions
- Microsoft.Extensions.Caching.SqlServer.SqlServerCacheOptions
- Microsoft.Extensions.Caching.Redis.RedisCacheOptions
Some very interesting tells here. When the namespace changes to support the interface, the namespace change Caching is also prefixed to the derived type RedisCacheOptions. Additionally, the derived types are in an additional namespace of the implementation.
Memory -> MemoryCacheOptions
SqlServer -> SqlServerCacheOptions
Redis -> RedisCacheOptions
This seems like a fairly easy pattern to follow most of the time. As an example (since none was given in the question), the following pattern might emerge:
CarDealership.Entities.Dll
CarDealership.Entities.IPerson
CarDealership.Entities.IVehicle
CarDealership.Entities.Person
CarDealership.Entities.Vehicle
Maybe a technology like Entity Framework prevents you from using the predefined classes. Thus we make our own.
CarDealership.Entities.EntityFramework.Dll
CarDealership.Entities.EntityFramework.Person
CarDealership.Entities.EntityFramework.Vehicle
CarDealership.Entities.EntityFramework.SalesPerson
CarDealership.Entities.EntityFramework.FinancePerson
CarDealership.Entities.EntityFramework.LotVehicle
CarDealership.Entities.EntityFramework.ShuttleVehicle
CarDealership.Entities.EntityFramework.BorrowVehicle
Not that it happens often, but maybe there's a decision to switch technologies for whatever reason and now we have...
CarDealership.Entities.Dapper.Dll
CarDealership.Entities.Dapper.Person
CarDealership.Entities.Dapper.Vehicle
//etc
As long as we're programming to the interfaces we've defined in the root Entities namespace (following the Liskov Substitution Principle), downstream code doesn't care where or how the interface was implemented.
More importantly, in my opinion, creating derived types in child namespaces also means you don't have to constantly include a different namespace, because the parent namespace contains the interfaces. I'm not sure I've ever seen a Microsoft example of interfaces stored in child namespaces that are then implemented in the parent namespace (almost an anti-pattern if you ask me).
I definitely don't recommend segregating your code by type, eg:
MyNamespace.Interfaces
MyNamespace.Enums
MyNameSpace.Classes
MyNamespace.Structs
This doesn't add value to being descriptive. And it's akin to using System Hungarian notation, which is mostly if not now exclusively, frowned upon.
I HATE when I find interfaces and implementations in the same namespace/assembly. Please don't do that, if the project evolves, it's a pain in the ass to refactor.
When I reference an interface, I want to implement it, not to get all its implementations.
What might be admissible is to put the interface with its dependent class (the class that references the interface).
EDIT: @Josh, I just read my last sentence and it's confusing! Of course, both the dependent class and the one that implements the interface reference it. To make myself clear, I'll give examples:
Acceptable :
Interface + dependent class:
namespace A
{
    public interface IMyInterface
    {
        void MyMethod();
    }

    // Consumes the interface; it never sees a concrete implementation.
    public class MyDependentClass
    {
        private readonly IMyInterface inject;

        public MyDependentClass(IMyInterface inject)
        {
            this.inject = inject;
        }

        public void DoJob()
        {
            // Bla bla
            inject.MyMethod();
        }
    }
}
Implementing class:
namespace B
{
    using System;
    using A;

    public class MyImplementing : IMyInterface
    {
        public void MyMethod()
        {
            Console.WriteLine("hello world");
        }
    }
}
NOT ACCEPTABLE:
namespace A
{
    using System;

    public interface IMyInterface
    {
        void MyMethod();
    }

    // The implementation lives right next to the interface it implements.
    public class MyImplementing : IMyInterface
    {
        public void MyMethod()
        {
            Console.WriteLine("hello world");
        }
    }
}
And please DON'T CREATE a separate project/garbage just for your interfaces! Example: ShittyProject.Interfaces. You've missed the point!
Imagine you created a DLL reserved for your interfaces (200 MB). If you had to add a single interface with two lines of code, your users would have to update 200 MB just for two dumb signatures!