My Android malware dataset is based on ContagioDump. It is a collection of Android malware seen in the wild; I specifically avoided self-created samples. The samples were downloaded on October 26th, 2011, and the dataset contains 189 of them in total. I have qualitatively split them into categories based on their primary behaviours, where that information was available. I obtained the primary behaviours from malware reports published by the various AV companies. If the malware's primary function was to download a separate payload, it was put in the Trojan category. If the malware executed an escalation of privilege attack, it was put in the escalation of privilege category. If the malware primarily stole data from the phone, it was classified as information stealing. If the malware sent premium SMS messages, it was classified as premium SMS transmitting malware.
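For illustration only, the categorisation rules above can be written as a small lookup. This is a sketch, not the procedure used to build the dataset: the classification was done qualitatively from AV reports. The behaviour labels (`downloads_payload` and so on), the `categorize` function, and the "Uncategorized" fallback for samples with no reported primary behaviour are all hypothetical names introduced here.

```python
from typing import Optional

# Hypothetical behaviour labels; the dataset does not define these strings.
CATEGORY_BY_PRIMARY_BEHAVIOUR = {
    "downloads_payload": "Trojan",
    "escalates_privilege": "Escalation of privilege",
    "steals_data": "Information stealing",
    "sends_premium_sms": "Premium SMS",
}


def categorize(primary_behaviour: Optional[str]) -> str:
    """Map a sample's single primary behaviour to a dataset category.

    Samples with no reported or unrecognised primary behaviour fall back to
    "Uncategorized" (an assumed label, not one taken from the dataset).
    """
    if primary_behaviour is None:
        return "Uncategorized"
    return CATEGORY_BY_PRIMARY_BEHAVIOUR.get(primary_behaviour, "Uncategorized")
```

Each sample is keyed on one primary behaviour, so no precedence between categories is needed in this sketch.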
Download link: AndroidMalwareDataSet.tar.bz2
If you use this dataset I would very much appreciate an acknowledgement/citation for both ContagioDump and myself (with a stress on Contagio).
You can find an analysis of the permissions used by the malware in this dataset here. NOTE: This analysis does not take into account developers requesting a permission more than once, so consider it a rough estimate. An analysis that takes into account potential duplicate permission requests in a manifest can be found here. The former is a LibreOffice file, the latter is an Excel file.
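To show why duplicate requests matter, here is a minimal sketch of counting permissions both ways. It is not the method used to produce the two spreadsheets. It assumes each manifest has already been decoded from the APK's binary XML into plain-text XML (for example with a tool such as apktool), and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from typing import Iterable, List

# XML namespace used for android:* attributes in AndroidManifest.xml.
ANDROID_NS = "http://schemas.android.com/apk/res/android"


def requested_permissions(manifest_xml: str) -> List[str]:
    """Return every android:name on <uses-permission>, duplicates included."""
    root = ET.fromstring(manifest_xml)
    names = []
    for elem in root.iter("uses-permission"):
        name = elem.get(f"{{{ANDROID_NS}}}name")
        if name:
            names.append(name)
    return names


def count_permissions(manifests: Iterable[str], dedupe_per_manifest: bool) -> Counter:
    """Count permission requests across decoded manifests.

    With dedupe_per_manifest=False, a permission declared twice in one
    manifest is counted twice. With True, it is counted once per manifest.
    """
    counts: Counter = Counter()
    for manifest_xml in manifests:
        names = requested_permissions(manifest_xml)
        counts.update(set(names) if dedupe_per_manifest else names)
    return counts
```

Calling `count_permissions(manifests, dedupe_per_manifest=False)` inflates the total for any permission that a manifest declares more than once, while `dedupe_per_manifest=True` gives the number of samples requesting each permission.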