How to parse the AndroidManifest.xml file inside an .apk package

Question

This file appears to be in a binary XML format. What is this format, and how can it be parsed programmatically (as opposed to using the aapt dump tool in the SDK)?

This binary format is not discussed in the documentation here.

Note: I want to access this information from outside the Android environment, preferably from Java.

User · Answer

With the latest SDK Tools, you can now use a tool called apkanalyzer to print out the AndroidManifest.xml of an APK (as well as other parts, such as resources):

    [android-sdk]/tools/bin/apkanalyzer manifest print [app.apk]

Reference: apkanalyzer
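If you would rather call the tool from a script than from a shell, a minimal Python sketch might look like the following. The function names are hypothetical, and the sketch assumes `apkanalyzer` is reachable on your PATH; only the `manifest print` sub-command quoted above is used.

```python
import subprocess


def manifest_print_command(apk_path, apkanalyzer="apkanalyzer"):
    """Build the argument list for `apkanalyzer manifest print <apk>`.

    `apkanalyzer` defaults to a bare name (assumes it is on PATH);
    pass a full path to the SDK tool if it is not.
    """
    return [apkanalyzer, "manifest", "print", apk_path]


def print_manifest(apk_path, apkanalyzer="apkanalyzer"):
    """Run apkanalyzer and return the decoded manifest XML as text."""
    result = subprocess.run(
        manifest_print_command(apk_path, apkanalyzer),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the tool fails
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems when the APK path contains spaces.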

User · Answer

What about using the Android Asset Packaging Tool (aapt), from the Android SDK, in a Python (or whatever) script?

Through aapt (http://elinux.org/Android_aapt) you can retrieve information about the .apk package and about its AndroidManifest.xml file. In particular, you can extract the values of individual elements of an .apk package through the 'dump' sub-command. For example, you can extract the user permissions in the AndroidManifest.xml file inside an .apk package in this way:

    $ aapt dump permissions package.apk

Where package.apk is your .apk package.

Moreover, you can use a Unix pipe to clean up the output. For example:

    $ aapt dump permissions package.apk | sed 1d | awk '{ print $NF }'

Here is a Python script that does that programmatically:

    import os
    import subprocess

    # Current directory and file name:
    curpath = os.path.dirname(os.path.realpath(__file__))
    filepath = os.path.join(curpath, "package.apk")

    # Extract the AndroidManifest.xml permissions:
    command = "aapt dump permissions " + filepath + " | sed 1d | awk '{ print $NF }'"
    process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=None,
                               shell=True, universal_newlines=True)
    permissions = process.communicate()[0]

    print(permissions)

In a similar fashion you can extract other information (e.g. package, app name, etc.) from the AndroidManifest.xml:

    def findBetween(s, prefix, suffix):
        # Return the text between prefix and suffix, or "" if either is missing.
        try:
            start = s.index(prefix) + len(prefix)
            end = s.index(suffix, start)
            return s[start:end]
        except ValueError:
            return ""

    # Extract the APK package info:
    shellcommand = "aapt dump badging " + filepath
    process = subprocess.Popen(shellcommand, stdout=subprocess.PIPE, stderr=None,
                               shell=True, universal_newlines=True)
    apkInfo = process.communicate()[0].splitlines()

    for info in apkInfo:
        # Package info:
        if "package:" in info:
            print("App Package: " + findBetween(info, "name='", "'"))
            print("App Version: " + findBetween(info, "versionName='", "'"))
            continue

        # App name:
        if "application:" in info:
            print("App Name: " + findBetween(info, "label='", "'"))
            continue

If instead you want to parse the entire AndroidManifest XML tree, you can do that in a similar way using the xmltree command:

    $ aapt dump xmltree package.apk AndroidManifest.xml

Using Python as before:

    # Extract the AndroidManifest XML tree:
    shellcommand = "aapt dump xmltree " + filepath + " AndroidManifest.xml"
    process = subprocess.Popen(shellcommand, stdout=subprocess.PIPE, stderr=None,
                               shell=True, universal_newlines=True)
    xmlTree = process.communicate()[0]

    print("Number of Activities: " + str(xmlTree.count("activity")))
    print("Number of Services: " + str(xmlTree.count("service")))
    print("Number of BroadcastReceivers: " + str(xmlTree.count("receiver")))
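As an alternative to searching each line for prefixes, the key='value' pairs in `aapt dump badging` output can be pulled out with a regular expression. This is only a sketch: `parse_badging_line` is a hypothetical helper, and the sample line is made up to match the shape of real badging output.

```python
import re

# Matches key='value' pairs such as name='com.example.app'
_PAIR = re.compile(r"(\w+)='([^']*)'")


def parse_badging_line(line):
    """Split one badging line into (tag, {key: value})."""
    tag, _, rest = line.partition(":")
    return tag.strip(), dict(_PAIR.findall(rest))


# Made-up sample in the shape of a real "package:" line
sample = "package: name='com.example.app' versionCode='42' versionName='1.2.3'"
tag, fields = parse_badging_line(sample)
print(tag, fields["name"], fields["versionName"])
```

Lines that carry a single bare value rather than key='value' pairs simply yield an empty dictionary, so they need separate handling if you care about them.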

User · Answer

apk-parser (https://github.com/caoqianli/apk-parser) is a lightweight Java implementation with no dependency on aapt or other binaries. It is good for parsing binary XML files and other APK information.

    ApkParser apkParser = new ApkParser(new File(filePath));
    // Set a locale to translate resource tags into strings in the language
    // the locale specifies. For example, with Locale.ENGLISH you get the APK
    // title 'WeChat' instead of '@string/app_name'.
    apkParser.setPreferredLocale(locale);

    String xml = apkParser.getManifestXml();
    System.out.println(xml);

    String xml2 = apkParser.transBinaryXml(xmlPathInApk);
    System.out.println(xml2);

    ApkMeta apkMeta = apkParser.getApkMeta();
    System.out.println(apkMeta);

    Set<Locale> locales = apkParser.getLocales();
    for (Locale l : locales) {
        System.out.println(l);
    }
    apkParser.close();

User · Answer

I have been running with the Ribo code posted above for over a year, and it has served us well. With recent updates (Gradle 3.x), though, I was no longer able to parse the AndroidManifest.xml: I was getting index out of bounds errors, and in general it was no longer able to parse the file.

Update: I now believe that our issue was with upgrading to Gradle 3.x. This article describes how AirWatch had issues that can be fixed by using a Gradle setting to use aapt instead of aapt2: "AirWatch seems to be incompatible with Android Plugin for Gradle 3.0.0-beta1".

In searching around I came across this open source project. It is being maintained, and with it I was able to read both my old APKs that I could previously parse and the new APKs on which the logic from Ribo threw exceptions:

https://github.com/xgouchet/AXML

From his example, this is what I'm doing:

    zf = new ZipFile(apkFile);

    // Getting the manifest
    ZipEntry entry = zf.getEntry("AndroidManifest.xml");

    InputStream is = zf.getInputStream(entry);

    // Read our manifest Document
    Document manifestDoc = new CompressedXmlParser().parseDOM(is);

    // Make sure we got a doc, and that it has children
    if (null != manifestDoc && manifestDoc.getChildNodes().getLength() > 0) {
        Node firstNode = manifestDoc.getFirstChild();

        // Now get the attributes out of the node
        NamedNodeMap nodeMap = firstNode.getAttributes();

        // Finally to a point where we can read out our values
        versionName = nodeMap.getNamedItem("android:versionName").getNodeValue();
        versionCode = nodeMap.getNamedItem("android:versionCode").getNodeValue();
    }
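The first step above, pulling the AndroidManifest.xml entry out of the archive, works the same way in any language because an APK is an ordinary ZIP file. Here is a minimal Python sketch of just that step; `read_manifest_bytes` is a hypothetical helper, and the demo archive is a tiny stand-in built in memory, not a real APK.

```python
import io
import zipfile


def read_manifest_bytes(apk):
    """Return the raw (binary XML) bytes of AndroidManifest.xml.

    `apk` may be a filesystem path or a seekable file-like object.
    """
    with zipfile.ZipFile(apk) as zf:
        return zf.read("AndroidManifest.xml")


# Self-contained demo: build a stand-in archive in memory.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("AndroidManifest.xml", b"\x03\x00\x08\x00")

print(read_manifest_bytes(buf).hex())
```

The bytes returned are still in the binary XML format, so they have to be handed to a decoder such as the ones discussed in this thread.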

User · Answer

You can use the axml2xml.pl tool, developed a while ago within the android-random project. It will generate the textual manifest file (AndroidManifest.xml) from the binary one.

I'm saying "textual" and not "original" because, like many reverse-engineering tools, this one isn't perfect and the result will not be complete. I presume either it was never feature complete or it is simply not forward-compatible (with the newer binary encoding scheme). Whatever the reason, the axml2xml.pl tool will not be able to extract all the attribute values correctly. Such attributes are minSdkVersion, targetSdkVersion and basically all attributes that reference resources (like strings, icons, etc.); i.e. only class names (of activities, services, etc.) are extracted correctly.

However, you can still find this missing information by running the aapt tool on the original Android app file (.apk):

    aapt l -a <someapp.apk>

User · Answer

Use android-apktool.

There is an application that reads apk files and decodes XMLs to nearly original form.

Usage:

    apktool d Gmail.apk && cat Gmail/AndroidManifest.xml

Check android-apktool for more information.

User · Answer

In Android Studio 2.2 you can analyze the APK directly. Go to Build > Analyze APK, select the APK, and navigate to AndroidManifest.xml. You can then see the details of the manifest.

User · Answer

Check the following WPF project, which decodes the properties correctly.

User · Answer

apkanalyzer will be helpful:

    @echo off
    ::##########################################################################
    ::##
    ::##  apkanalyzer start up script for Windows
    ::##
    ::##  converted by ewwink
    ::##
    ::##########################################################################

    ::Attempt to set APP_HOME

    SET SAVED=%cd%
    SET APP_HOME=C:\android\sdk\tools
    SET APP_NAME="apkanalyzer"

    ::Add default JVM options here. You can also use JAVA_OPTS and APKANALYZER_OPTS to pass JVM options to this script.
    SET DEFAULT_JVM_OPTS=-Dcom.android.sdklib.toolsdir=%APP_HOME%

    SET CLASSPATH=%APP_HOME%\lib\dvlib-26.0.0-dev.jar;%APP_HOME%\lib\util-2.2.1.jar;%APP_HOME%\lib\jimfs-1.1.jar;%APP_HOME%\lib\annotations-13.0.jar;%APP_HOME%\lib\ddmlib-26.0.0-dev.jar;%APP_HOME%\lib\repository-26.0.0-dev.jar;%APP_HOME%\lib\sdk-common-26.0.0-dev.jar;%APP_HOME%\lib\kotlin-stdlib-1.1.3-2.jar;%APP_HOME%\lib\protobuf-java-3.0.0.jar;%APP_HOME%\lib\apkanalyzer-cli.jar;%APP_HOME%\lib\gson-2.3.jar;%APP_HOME%\lib\httpcore-4.2.5.jar;%APP_HOME%\lib\dexlib2-2.2.1.jar;%APP_HOME%\lib\commons-compress-1.12.jar;%APP_HOME%\lib\generator.jar;%APP_HOME%\lib\error_prone_annotations-2.0.18.jar;%APP_HOME%\lib\commons-codec-1.6.jar;%APP_HOME%\lib\kxml2-2.3.0.jar;%APP_HOME%\lib\httpmime-4.1.jar;%APP_HOME%\lib\annotations-12.0.jar;%APP_HOME%\lib\bcpkix-jdk15on-1.56.jar;%APP_HOME%\lib\jsr305-3.0.0.jar;%APP_HOME%\lib\explainer.jar;%APP_HOME%\lib\builder-model-3.0.0-dev.jar;%APP_HOME%\lib\baksmali-2.2.1.jar;%APP_HOME%\lib\j2objc-annotations-1.1.jar;%APP_HOME%\lib\layoutlib-api-26.0.0-dev.jar;%APP_HOME%\lib\jcommander-1.64.jar;%APP_HOME%\lib\commons-logging-1.1.1.jar;%APP_HOME%\lib\annotations-26.0.0-dev.jar;%APP_HOME%\lib\builder-test-api-3.0.0-dev.jar;%APP_HOME%\lib\animal-sniffer-annotations-1.14.jar;%APP_HOME%\lib\bcprov-jdk15on-1.56.jar;%APP_HOME%\lib\httpclient-4.2.6.jar;%APP_HOME%\lib\common-26.0.0-dev.jar;%APP_HOME%\lib\jopt-simple-4.9.jar;%APP_HOME%\lib\sdklib-26.0.0-dev.jar;%APP_HOME%\lib\apkanalyzer.jar;%APP_HOME%\lib\shared.jar;%APP_HOME%\lib\binary-resources.jar;%APP_HOME%\lib\guava-22.0.jar
    SET APP_ARGS=%*

    ::Collect all arguments for the java command, following the shell quoting and substitution rules
    SET APKANALYZER_OPTS=%DEFAULT_JVM_OPTS% -classpath %CLASSPATH% com.android.tools.apk.analyzer.ApkAnalyzerCli %APP_ARGS%

    ::Determine the Java command to use to start the JVM.
    SET JAVACMD="java"
    where %JAVACMD% >nul 2>nul
    if %errorlevel%==1 (
      echo ERROR: 'java' command could not be found in your PATH.
      echo Please set the 'java' variable in your environment to match the
      echo location of your Java installation.
      echo.
      exit /b 1
    )

    ::execute apkanalyzer
    %JAVACMD% %APKANALYZER_OPTS%

Original post: https://stackoverflow.com/a/51905063/1383521

User · Answer

I found AXMLPrinter2, a Java app over at the Android4Me project, to work fine on the AndroidManifest.xml that I had (and it prints the XML out in a nicely formatted way).

http://code.google.com/p/android4me/downloads/detail?name=AXMLPrinter2.jar

One note: it (and the code in this answer from Ribo) doesn't appear to handle every compiled XML file that I've come across. I found one where the strings were stored with one byte per character, rather than the double-byte format that it assumes.

User · Answer

Mathieu's Kotlin version follows:

    import java.util.zip.ZipFile

    fun main(args: Array<String>) {
        val fileName = "app.apk"
        ZipFile(fileName).use { zip ->
            zip.entries().asSequence().forEach { entry ->
                if (entry.name == "AndroidManifest.xml") {
                    zip.getInputStream(entry).use { input ->
                        val xml = decompressXML(input.readBytes())
                        // TODO: parse the XML
                        println(xml)
                    }
                }
            }
        }
    }

    /** Binary XML doc ending Tag */
    var endDocTag = 0x00100101

    /** Binary XML start Tag */
    var startTag = 0x00100102

    /** Binary XML end Tag */
    var endTag = 0x00100103

    /**
     * Reference var for spacing.
     * Used in prtIndent()
     */
    var spaces = "                                             "

    /**
     * Parse the 'compressed' binary form of Android XML docs
     * such as for AndroidManifest.xml in .apk files.
     * Source: http://stackoverflow.com/questions/2097813/how-to-parse-the-androidmanifest-xml-file-inside-an-apk-package/4761689#4761689
     *
     * @param xml Encoded XML content to decompress
     */
    fun decompressXML(xml: ByteArray): String {
        val resultXml = StringBuilder()

        // Compressed XML file/bytes starts with 24x bytes of data,
        // 9 32 bit words in little endian order (LSB first):
        //   0th word is 03 00 08 00
        //   3rd word SEEMS TO BE:  Offset at the end of StringTable
        //   4th word is: Number of strings in string table
        // WARNING: Sometimes I indiscriminately display or refer to a word in
        //   little endian storage format, or in integer format (ie MSB first).
        val numbStrings = LEW(xml, 4 * 4)

        // StringIndexTable starts at offset 24x, an array of 32 bit LE offsets
        // of the length/string data in the StringTable.
        val sitOff = 0x24  // Offset of start of StringIndexTable

        // StringTable, each string is represented with a 16 bit little endian
        // character count, followed by that number of 16 bit (LE) (Unicode) chars.
        val stOff = sitOff + numbStrings * 4  // StringTable follows StrIndexTable

        // XMLTags, The XML tag tree starts after some unknown content after the
        // StringTable.  There is some unknown data after the StringTable, scan
        // forward from this point to the flag for the start of an XML start tag.
        var xmlTagOff = LEW(xml, 3 * 4)  // Start from the offset in the 3rd word.
        // Scan forward until we find the bytes: 0x02011000 (x00100102 in normal int)
        run {
            var ii = xmlTagOff
            while (ii < xml.size - 4) {
                if (LEW(xml, ii) == startTag) {
                    xmlTagOff = ii
                    break
                }
                ii += 4
            }
        } // end of hack, scanning for start of first start tag

        // XML tags and attributes:
        // Every XML start and end tag consists of 6 32 bit words:
        //   0th word: 02011000 for startTag and 03011000 for endTag
        //   1st word: a flag?, like 38000000
        //   2nd word: Line of where this tag appeared in the original source file
        //   3rd word: FFFFFFFF ??
        //   4th word: StringIndex of NameSpace name, or FFFFFFFF for default NS
        //   5th word: StringIndex of Element Name
        //   (Note: 01011000 in 0th word means end of XML document, endDocTag)

        // Start tags (not end tags) contain 3 more words:
        //   6th word: 14001400 meaning??
        //   7th word: Number of Attributes that follow this tag (follow word 8th)
        //   8th word: 00000000 meaning??

        // Attributes consist of 5 words:
        //   0th word: StringIndex of Attribute Name's Namespace, or FFFFFFFF
        //   1st word: StringIndex of Attribute Name
        //   2nd word: StringIndex of Attribute Value, or FFFFFFFF if ResourceId used
        //   3rd word: Flags?
        //   4th word: str ind of attr value again, or ResourceId of value

        // TMP, dump string table to tr for debugging
        //tr.addSelect("strings", null);
        //for (int ii=0; ii<numbStrings; ii++) {
        //  // Length of string starts at StringTable plus offset in StrIndTable
        //  String str = compXmlString(xml, sitOff, stOff, ii);
        //  tr.add(String.valueOf(ii), str);
        //}
        //tr.parent();

        // Step through the XML tree element tags and attributes
        var off = xmlTagOff
        var indent = 0
        var startTagLineNo = -2
        while (off < xml.size) {
            val tag0 = LEW(xml, off)
            //int tag1 = LEW(xml, off+1*4);
            val lineNo = LEW(xml, off + 2 * 4)
            //int tag3 = LEW(xml, off+3*4);
            val nameNsSi = LEW(xml, off + 4 * 4)
            val nameSi = LEW(xml, off + 5 * 4)

            if (tag0 == startTag) { // XML START TAG
                val tag6 = LEW(xml, off + 6 * 4)  // Expected to be 14001400
                val numbAttrs = LEW(xml, off + 7 * 4)  // Number of Attributes to follow
                //int tag8 = LEW(xml, off+8*4);  // Expected to be 00000000
                off += 9 * 4  // Skip over 6+3 words of startTag data
                val name = compXmlString(xml, sitOff, stOff, nameSi)
                //tr.addSelect(name, null);
                startTagLineNo = lineNo

                // Look for the Attributes
                val sb = StringBuffer()
                for (ii in 0 until numbAttrs) {
                    val attrNameNsSi = LEW(xml, off)  // AttrName Namespace Str Ind, or FFFFFFFF
                    val attrNameSi = LEW(xml, off + 1 * 4)  // AttrName String Index
                    val attrValueSi = LEW(xml, off + 2 * 4) // AttrValue Str Ind, or FFFFFFFF
                    val attrFlags = LEW(xml, off + 3 * 4)
                    val attrResId = LEW(xml, off + 4 * 4)  // AttrValue ResourceId or dup AttrValue StrInd
                    off += 5 * 4  // Skip over the 5 words of an attribute

                    val attrName = compXmlString(xml, sitOff, stOff, attrNameSi)
                    val attrValue = if (attrValueSi != -1)
                        compXmlString(xml, sitOff, stOff, attrValueSi)
                    else
                        "resourceID 0x" + Integer.toHexString(attrResId)
                    sb.append(" " + attrName + "=\"" + attrValue + "\"")
                    //tr.add(attrName, attrValue);
                }
                resultXml.append(prtIndent(indent, "<" + name + sb + ">"))
                indent++

            } else if (tag0 == endTag) { // XML END TAG
                indent--
                off += 6 * 4  // Skip over 6 words of endTag data
                val name = compXmlString(xml, sitOff, stOff, nameSi)
                resultXml.append(prtIndent(indent, "</" + name + ">  (line " + startTagLineNo + "-" + lineNo + ")"))
                //tr.parent();  // Step back up the NobTree

            } else if (tag0 == endDocTag) {  // END OF XML DOC TAG
                break

            } else {
                println("  Unrecognized tag code '" + Integer.toHexString(tag0)
                        + "' at offset " + off)
                break
            }
        } // end of while loop scanning tags and attributes of XML tree
        println("    end at offset $off")

        return resultXml.toString()
    } // end of decompressXML

    /**
     * Tool Method for decompressXML().
     * Compute binary XML to its string format.
     * Source: http://stackoverflow.com/questions/2097813/how-to-parse-the-androidmanifest-xml-file-inside-an-apk-package/4761689#4761689
     *
     * @param xml Binary-formatted XML
     * @param sitOff
     * @param stOff
     * @param strInd
     * @return String-formatted XML, or null if strInd is negative
     */
    fun compXmlString(xml: ByteArray, sitOff: Int, stOff: Int, strInd: Int): String? {
        if (strInd < 0) return null
        val strOff = stOff + LEW(xml, sitOff + strInd * 4)
        return compXmlStringAt(xml, strOff)
    }

    /**
     * Tool Method for decompressXML().
     * Apply indentation.
     *
     * @param indent Indentation level
     * @param str String to indent
     * @return Indented string
     */
    fun prtIndent(indent: Int, str: String): String {
        return spaces.substring(0, Math.min(indent * 2, spaces.length)) + str
    }

    /**
     * Tool method for decompressXML().
     * Return the string stored in StringTable format at
     * offset strOff.  This offset points to the 16 bit string length, which
     * is followed by that number of 16 bit (Unicode) chars.
     *
     * @param arr StringTable array
     * @param strOff Offset to get string from
     * @return String from StringTable at offset strOff
     */
    fun compXmlStringAt(arr: ByteArray, strOff: Int): String {
        val strLen = (arr[strOff + 1] shl 8 and 0xff00) or (arr[strOff].toInt() and 0xff)
        val chars = ByteArray(strLen)
        for (ii in 0 until strLen) {
            chars[ii] = arr[strOff + 2 + ii * 2]
        }
        return String(chars)  // Hack, just use 8-bit chars
    } // end of compXmlStringAt

    /**
     * Return value of a Little Endian 32 bit word from the byte array
     * at offset off.
     *
     * @param arr Byte array with 32 bit word
     * @param off Offset to get word from
     * @return Value of Little Endian 32 bit word specified
     */
    fun LEW(arr: ByteArray, off: Int): Int {
        return (arr[off + 3] shl 24 and -0x1000000 or ((arr[off + 2] shl 16) and 0xff0000)
                or (arr[off + 1] shl 8 and 0xff00) or (arr[off].toInt() and 0xFF))
    } // end of LEW

    private infix fun Byte.shl(i: Int): Int = (this.toInt() shl i)
    private infix fun Int.shl(i: Int): Int = (this shl i)

This is a Kotlin version of the answer above.
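The two building blocks of this decoder, reading a little-endian 32-bit word and scanning for the first start-tag marker, are compact enough to show in isolation. The following Python sketch mirrors `LEW` and the "scan forward" hack; the function names are hypothetical and the demo bytes are synthetic, not taken from a real manifest.

```python
import struct

START_TAG = 0x00100102  # stored on disk as bytes 02 01 10 00


def lew(data, off):
    """Little-endian 32-bit word at byte offset `off` (signed, like a Java int)."""
    return struct.unpack_from("<i", data, off)[0]


def find_first_start_tag(data, start):
    """Scan forward in 4-byte steps for the first start-tag marker.

    Returns the byte offset of the marker, or -1 if none is found.
    """
    for off in range(start, len(data) - 4, 4):
        if lew(data, off) == START_TAG:
            return off
    return -1


# Synthetic demo: 8 filler bytes, the marker, then 8 more filler bytes
demo = bytes(8) + bytes([0x02, 0x01, 0x10, 0x00]) + bytes(8)
print(hex(lew(demo, 8)), find_first_start_tag(demo, 0))
```

Reading the word as signed means the value FFFFFFFF comes back as -1, which is what the comparisons against -1 in the decoder rely on.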

User · Answer

For reference, here is my version of Ribo's code. The main difference is that decompressXML() directly returns a String, which for my purposes was a more appropriate usage.

NOTE: my sole purpose in using Ribo's solution was to fetch an .APK file's published version from the Manifest XML file, and I confirm that for this purpose it works beautifully.

EDIT [2013-03-16]: It works beautifully IF the version is set as plain text, but if it's set to refer to a Resource XML, it'll show up as 'Resource 0x1', for example. In this particular case, you'll probably have to couple this solution to another solution that will fetch the proper string resource reference.

    /**
     * Binary XML doc ending Tag
     */
    public static int endDocTag = 0x00100101;

    /**
     * Binary XML start Tag
     */
    public static int startTag =  0x00100102;

    /**
     * Binary XML end Tag
     */
    public static int endTag =    0x00100103;

    /**
     * Reference var for spacing
     * Used in prtIndent()
     */
    public static String spaces = "                                             ";

    /**
     * Parse the 'compressed' binary form of Android XML docs
     * such as for AndroidManifest.xml in .apk files
     * Source: http://stackoverflow.com/questions/2097813/how-to-parse-the-androidmanifest-xml-file-inside-an-apk-package/4761689#4761689
     *
     * @param xml Encoded XML content to decompress
     */
    public static String decompressXML(byte[] xml) {

        StringBuilder resultXml = new StringBuilder();

        // Compressed XML file/bytes starts with 24x bytes of data,
        // 9 32 bit words in little endian order (LSB first):
        //   0th word is 03 00 08 00
        //   3rd word SEEMS TO BE:  Offset at the end of StringTable
        //   4th word is: Number of strings in string table
        // WARNING: Sometimes I indiscriminately display or refer to a word in
        //   little endian storage format, or in integer format (ie MSB first).
        int numbStrings = LEW(xml, 4*4);

        // StringIndexTable starts at offset 24x, an array of 32 bit LE offsets
        // of the length/string data in the StringTable.
        int sitOff = 0x24;  // Offset of start of StringIndexTable

        // StringTable, each string is represented with a 16 bit little endian
        // character count, followed by that number of 16 bit (LE) (Unicode) chars.
        int stOff = sitOff + numbStrings*4;  // StringTable follows StrIndexTable

        // XMLTags, The XML tag tree starts after some unknown content after the
        // StringTable.  There is some unknown data after the StringTable, scan
        // forward from this point to the flag for the start of an XML start tag.
        int xmlTagOff = LEW(xml, 3*4);  // Start from the offset in the 3rd word.
        // Scan forward until we find the bytes: 0x02011000 (x00100102 in normal int)
        for (int ii=xmlTagOff; ii<xml.length-4; ii+=4) {
            if (LEW(xml, ii) == startTag) {
                xmlTagOff = ii;  break;
            }
        } // end of hack, scanning for start of first start tag

        // XML tags and attributes:
        // Every XML start and end tag consists of 6 32 bit words:
        //   0th word: 02011000 for startTag and 03011000 for endTag
        //   1st word: a flag?, like 38000000
        //   2nd word: Line of where this tag appeared in the original source file
        //   3rd word: FFFFFFFF ??
        //   4th word: StringIndex of NameSpace name, or FFFFFFFF for default NS
        //   5th word: StringIndex of Element Name
        //   (Note: 01011000 in 0th word means end of XML document, endDocTag)

        // Start tags (not end tags) contain 3 more words:
        //   6th word: 14001400 meaning??
        //   7th word: Number of Attributes that follow this tag (follow word 8th)
        //   8th word: 00000000 meaning??

        // Attributes consist of 5 words:
        //   0th word: StringIndex of Attribute Name's Namespace, or FFFFFFFF
        //   1st word: StringIndex of Attribute Name
        //   2nd word: StringIndex of Attribute Value, or FFFFFFFF if ResourceId used
        //   3rd word: Flags?
        //   4th word: str ind of attr value again, or ResourceId of value

        // TMP, dump string table to tr for debugging
        //tr.addSelect("strings", null);
        //for (int ii=0; ii<numbStrings; ii++) {
        //  // Length of string starts at StringTable plus offset in StrIndTable
        //  String str = compXmlString(xml, sitOff, stOff, ii);
        //  tr.add(String.valueOf(ii), str);
        //}
        //tr.parent();

        // Step through the XML tree element tags and attributes
        int off = xmlTagOff;
        int indent = 0;
        int startTagLineNo = -2;
        while (off < xml.length) {
            int tag0 = LEW(xml, off);
            //int tag1 = LEW(xml, off+1*4);
            int lineNo = LEW(xml, off+2*4);
            //int tag3 = LEW(xml, off+3*4);
            int nameNsSi = LEW(xml, off+4*4);
            int nameSi = LEW(xml, off+5*4);

            if (tag0 == startTag) { // XML START TAG
                int tag6 = LEW(xml, off+6*4);  // Expected to be 14001400
                int numbAttrs = LEW(xml, off+7*4);  // Number of Attributes to follow
                //int tag8 = LEW(xml, off+8*4);  // Expected to be 00000000
                off += 9*4;  // Skip over 6+3 words of startTag data
                String name = compXmlString(xml, sitOff, stOff, nameSi);
                //tr.addSelect(name, null);
                startTagLineNo = lineNo;

                // Look for the Attributes
                StringBuffer sb = new StringBuffer();
                for (int ii=0; ii<numbAttrs; ii++) {
                    int attrNameNsSi = LEW(xml, off);  // AttrName Namespace Str Ind, or FFFFFFFF
                    int attrNameSi = LEW(xml, off+1*4);  // AttrName String Index
                    int attrValueSi = LEW(xml, off+2*4); // AttrValue Str Ind, or FFFFFFFF
                    int attrFlags = LEW(xml, off+3*4);
                    int attrResId = LEW(xml, off+4*4);  // AttrValue ResourceId or dup AttrValue StrInd
                    off += 5*4;  // Skip over the 5 words of an attribute

                    String attrName = compXmlString(xml, sitOff, stOff, attrNameSi);
                    String attrValue = attrValueSi!=-1
                        ? compXmlString(xml, sitOff, stOff, attrValueSi)
                        : "resourceID 0x"+Integer.toHexString(attrResId);
                    sb.append(" "+attrName+"=\""+attrValue+"\"");
                    //tr.add(attrName, attrValue);
                }
                resultXml.append(prtIndent(indent, "<"+name+sb+">"));
                indent++;

            } else if (tag0 == endTag) { // XML END TAG
                indent--;
                off += 6*4;  // Skip over 6 words of endTag data
                String name = compXmlString(xml, sitOff, stOff, nameSi);
                resultXml.append(prtIndent(indent, "</"+name+">  (line "+startTagLineNo+"-"+lineNo+")"));
                //tr.parent();  // Step back up the NobTree

            } else if (tag0 == endDocTag) {  // END OF XML DOC TAG
                break;

            } else {
                // Log and TAG come from the surrounding Android class (android.util.Log)
                Log.e(TAG, "  Unrecognized tag code '"+Integer.toHexString(tag0)
                    +"' at offset "+off);
                break;
            }
        } // end of while loop scanning tags and attributes of XML tree
        Log.i(TAG, "    end at offset "+off);

        return resultXml.toString();
    } // end of decompressXML

    /**
     * Tool Method for decompressXML();
     * Compute binary XML to its string format
     * Source: http://stackoverflow.com/questions/2097813/how-to-parse-the-androidmanifest-xml-file-inside-an-apk-package/4761689#4761689
     *
     * @param xml Binary-formatted XML
     * @param sitOff
     * @param stOff
     * @param strInd
     * @return String-formatted XML
     */
    public static String compXmlString(byte[] xml, int sitOff, int stOff, int strInd) {
        if (strInd < 0) return null;
        int strOff = stOff + LEW(xml, sitOff+strInd*4);
        return compXmlStringAt(xml, strOff);
    }

    /**
     * Tool Method for decompressXML();
     * Apply indentation
     *
     * @param indent Indentation level
     * @param str String to indent
     * @return Indented string
     */
    public static String prtIndent(int indent, String str) {
        return (spaces.substring(0, Math.min(indent*2, spaces.length()))+str);
    }

    /**
     * Tool method for decompressXML()
     * Return the string stored in StringTable format at
     * offset strOff.  This offset points to the 16 bit string length, which
     * is followed by that number of 16 bit (Unicode) chars.
     *
     * @param arr StringTable array
     * @param strOff Offset to get string from
     * @return String from StringTable at offset strOff
     */
    public static String compXmlStringAt(byte[] arr, int strOff) {
        int strLen = arr[strOff+1]<<8&0xff00 | arr[strOff]&0xff;
        byte[] chars = new byte[strLen];
        for (int ii=0; ii<strLen; ii++) {
            chars[ii] = arr[strOff+2+ii*2];
        }
        return new String(chars);  // Hack, just use 8-bit chars
    } // end of compXmlStringAt

    /**
     * Return value of a Little Endian 32 bit word from the byte array
     * at offset off.
     *
     * @param arr Byte array with 32 bit word
     * @param off Offset to get word from
     * @return Value of Little Endian 32 bit word specified
     */
    public static int LEW(byte[] arr, int off) {
        return arr[off+3]<<24&0xff000000 | arr[off+2]<<16&0xff0000
            | arr[off+1]<<8&0xff00 | arr[off]&0xFF;
    } // end of LEW

Hope it can help other people too.
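The limitation described in the EDIT comes from treating every non-string attribute as a resource ID. In the binary format as I understand it, the word this code labels "Flags" carries a data-type byte in its top 8 bits, and that byte says how to interpret the final word. The Python sketch below shows one way to use it; the function name, the `strings` list parameter, and the output formatting are all assumptions for illustration, and resolving a reference to its actual string still requires the APK's resource table.

```python
# Data-type codes carried in the top byte of the attribute's 3rd word
TYPE_REFERENCE = 0x01    # data is a resource ID
TYPE_STRING = 0x03       # data is an index into the string table
TYPE_INT_DEC = 0x10      # data is an integer, shown in decimal
TYPE_INT_HEX = 0x11      # data is an integer, shown in hex
TYPE_INT_BOOLEAN = 0x12  # data is 0 (false) or non-zero (true)


def format_attr_value(flags_word, data, strings):
    """Render an attribute from its 3rd word (flags) and 4th word (data).

    `strings` is the already-decoded string table as a Python list.
    """
    data_type = (flags_word >> 24) & 0xFF
    if data_type == TYPE_STRING:
        return strings[data]
    if data_type == TYPE_INT_DEC:
        return str(data)
    if data_type == TYPE_INT_HEX:
        return hex(data & 0xFFFFFFFF)
    if data_type == TYPE_INT_BOOLEAN:
        return "true" if data != 0 else "false"
    if data_type == TYPE_REFERENCE:
        return "@0x%08x" % (data & 0xFFFFFFFF)
    # Unknown type: show both parts rather than guess
    return "type 0x%02x data 0x%08x" % (data_type, data & 0xFFFFFFFF)
```

With this, an integer attribute such as android:versionCode prints as a plain number instead of looking like a resource ID, while a versionName that points at a string resource still shows up as a reference.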

User · Answer

This can be helpful:

    public static int vCodeApk(String path) {
        PackageManager pm = G.context.getPackageManager();
        PackageInfo info = pm.getPackageArchiveInfo(path, 0);
        return info.versionCode;
        // Toast.makeText(this, "VersionCode : " + info.versionCode + ", VersionName : " + info.versionName, Toast.LENGTH_LONG).show();
    }

G is my Application class:

    public class G extends Application

User · Answer

If you're into Python or use Androguard, the Androguard androaxml feature will do this conversion for you. The feature is detailed in this blog post, with additional documentation here and source here.

Usage:

    $ ./androaxml.py -h
    Usage: androaxml.py [options]

    Options:
    -h, --help            show this help message and exit
    -i INPUT, --input=INPUT
                          filename input (APK or android's binary xml)
    -o OUTPUT, --output=OUTPUT
                          filename output of the xml
    -v, --version         version of the API

    $ ./androaxml.py -i yourfile.apk -o output.xml
    $ ./androaxml.py -i AndroidManifest.xml -o output.xml

User · Answer

In case it's useful, here's a C++ version of the Java snippet posted by Ribo:

    struct decompressXML
    {
        // decompressXML -- Parse the 'compressed' binary form of Android XML docs
        // such as for AndroidManifest.xml in .apk files
        enum
        {
            endDocTag = 0x00100101,
            startTag =  0x00100102,
            endTag =    0x00100103
        };

        decompressXML(const BYTE* xml, int cb) {
        // Compressed XML file/bytes starts with 24x bytes of data,
        // 9 32 bit words in little endian order (LSB first):
        //   0th word is 03 00 08 00
        //   3rd word SEEMS TO BE:  Offset at the end of StringTable
        //   4th word is: Number of strings in string table
        // WARNING: Sometimes I indiscriminately display or refer to a word in
        //   little endian storage format, or in integer format (ie MSB first).
        int numbStrings = LEW(xml, cb, 4*4);

        // StringIndexTable starts at offset 24x, an array of 32 bit LE offsets
        // of the length/string data in the StringTable.
        int sitOff = 0x24;  // Offset of start of StringIndexTable

        // StringTable, each string is represented with a 16 bit little endian
        // character count, followed by that number of 16 bit (LE) (Unicode) chars.
        int stOff = sitOff + numbStrings*4;  // StringTable follows StrIndexTable

        // XMLTags, The XML tag tree starts after some unknown content after the
        // StringTable.  There is some unknown data after the StringTable, scan
        // forward from this point to the flag for the start of an XML start tag.
        int xmlTagOff = LEW(xml, cb, 3*4);  // Start from the offset in the 3rd word.
        // Scan forward until we find the bytes: 0x02011000 (x00100102 in normal int)
        for (int ii=xmlTagOff; ii<cb-4; ii+=4) {
          if (LEW(xml, cb, ii) == startTag) {
            xmlTagOff = ii;  break;
          }
        } // end of hack, scanning for start of first start tag

        // XML tags and attributes:
        // Every XML start and end tag consists of 6 32 bit words:
        //   0th word: 02011000 for startTag and 03011000 for endTag
        //   1st word: a flag?, like 38000000
        //   2nd word: Line of where this tag appeared in the original source file
        //   3rd word: FFFFFFFF ??
        //   4th word: StringIndex of NameSpace name, or FFFFFFFF for default NS
        //   5th word: StringIndex of Element Name
        //   (Note: 01011000 in 0th word means end of XML document, endDocTag)

        // Start tags (not end tags) contain 3 more words:
        //   6th word: 14001400 meaning??
        //   7th word: Number of Attributes that follow this tag (follow word 8th)
        //   8th word: 00000000 meaning??

        // Attributes consist of 5 words:
        //   0th word: StringIndex of Attribute Name's Namespace, or FFFFFFFF
        //   1st word: StringIndex of Attribute Name
        //   2nd word: StringIndex of Attribute Value, or FFFFFFFF if ResourceId used
        //   3rd word: Flags?
        //   4th word: str ind of attr value again, or ResourceId of value

        // TMP, dump string table to tr for debugging
        //tr.addSelect("strings", null);
        //for (int ii=0; ii<numbStrings; ii++) {
        //  // Length of string starts at StringTable plus offset in StrIndTable
        //  String str = compXmlString(xml, sitOff, stOff, ii);
        //  tr.add(String.valueOf(ii), str);
        //}
        //tr.parent();

        // Step through the XML tree element tags and attributes
        int off = xmlTagOff;
        int indent = 0;
        int startTagLineNo = -2;
        while (off < cb) {
          int tag0 = LEW(xml, cb, off);
          //int tag1 = LEW(xml, off+1*4);
          int lineNo = LEW(xml, cb, off+2*4);
          //int tag3 = LEW(xml, off+3*4);
          int nameNsSi = LEW(xml, cb, off+4*4);
          int nameSi = LEW(xml, cb, off+5*4);

          if (tag0 == startTag) { // XML START TAG
            int tag6 = LEW(xml, cb, off+6*4);  // Expected to be 14001400
            int numbAttrs = LEW(xml, cb, off+7*4);  // Number of Attributes to follow
            //int tag8 = LEW(xml, off+8*4);  // Expected to be 00000000
            off += 9*4;  // Skip over 6+3 words of
startTag data         std  string name   compXmlString xml  cb  sitOff  stOff  nameSi             tr addSelect name  null           startTagLineNo   lineNo              Look for the Attributes         std  string sb          for  int ii 0  ii lt numbAttrs  ii                int attrNameNsSi   LEW xml  cb  off       AttrName Namespace Str Ind  or FFFFFFFF           int attrNameSi   LEW xml  cb  off 1 4       AttrName String Index           int attrValueSi   LEW xml  cb  off 2 4      AttrValue Str Ind  or FFFFFFFF           int attrFlags   LEW xml  cb  off 3 4               int attrResId   LEW xml  cb  off 4 4       AttrValue ResourceId or dup AttrValue StrInd           off    5 4      Skip over the 5 words of an attribute            std  string attrName   compXmlString xml  cb  sitOff  stOff  attrNameSi             std  string attrValue   attrValueSi  -1               compXmlString xml  cb  sitOff  stOff  attrValueSi                 resourceID 0x  toHexString attrResId             sb append     attrName       attrValue                    tr add attrName  attrValue                     prtIndent indent    lt   name sb   gt             indent             else if  tag0    endTag       XML END TAG         indent--          off    6 4      Skip over 6 words of endTag data         std  string name   compXmlString xml  cb  sitOff  stOff  nameSi           prtIndent indent    lt    name   gt    line   toIntString startTagLineNo   -  toIntString lineNo                  tr parent        Step back up the NobTree          else if  tag0    endDocTag        END OF XML DOC TAG         break           else           prt    Unrecognized tag code    toHexString tag0                at offset   toIntString off            break                   end of while loop scanning tags and attributes of XML tree     prt      end at offset   off            end of decompressXML       std  string compXmlString const BYTE  xml  int cb  int sitOff  int stOff  int strInd          if  strInd  lt  0  
return std  string            int strOff   stOff   LEW xml  cb  sitOff strInd 4         return compXmlStringAt xml  cb  strOff              void prt std  string str                printf   s   str c str               void prtIndent int indent  std  string str            char spaces 46           memset spaces       sizeof spaces            spaces min indent 2   sizeof spaces  - 1     0          prt spaces           prt str           prt   n                   compXmlStringAt -- Return the string stored in StringTable format at        offset strOff   This offset points to the 16 bit string length  which         is followed by that number of 16 bit  Unicode  chars      std  string compXmlStringAt const BYTE  arr  int cb  int strOff            if  cb  lt  strOff   2  return std  string            int strLen   arr strOff 1  lt  lt 8 amp 0xff00   arr strOff  amp 0xff        char  chars   new char strLen   1         chars strLen    0        for  int ii 0  ii lt strLen  ii                if  cb  lt  strOff   2   ii   2                            chars ii    0                break                      chars ii    arr strOff 2 ii 2                 std  string str chars         free chars         return str           end of compXmlStringAt          LEW -- Return value of a Little Endian 32 bit word from the byte array          at offset off      int LEW const BYTE  arr  int cb  int off          return  cb  gt  off   3      arr off 3  lt  lt 24 amp 0xff000000   arr off 2  lt  lt 16 amp 0xff0000             arr off 1  lt  lt 8 amp 0xff00   arr off  amp 0xFF     0           end of LEW      std  string toHexString DWORD attrResId                char ch 20           sprintf s ch  20    lx   attrResId           return std  string ch             std  string toIntString int i                char ch 20           sprintf s ch  20    ld   i           return std  string ch
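
Since the question prefers Java, here is a small Java sketch of the string-table entry layout described in the comments above: a 16-bit little-endian character count followed by that many 16-bit little-endian code units. It decodes the full 16-bit characters rather than keeping only the low byte. The class name `StringTableSketch` and the method `utf16At` are hypothetical names chosen for illustration, not part of any answer here. It also assumes the string pool is stored as UTF-16; some APKs store it as UTF-8, which this sketch does not handle.

```java
import java.nio.charset.StandardCharsets;

public class StringTableSketch {
    // Decode one string-table entry starting at strOff:
    // a 16-bit LE character count, then that many 16-bit LE code units.
    // Returns "" if the offset is out of range; truncates if the data is short.
    static String utf16At(byte[] arr, int strOff) {
        if (strOff < 0 || strOff + 2 > arr.length) return "";
        int strLen = (arr[strOff + 1] & 0xff) << 8 | (arr[strOff] & 0xff);
        int byteLen = Math.min(strLen * 2, arr.length - (strOff + 2));
        byteLen -= byteLen % 2; // never split a 16-bit code unit
        return new String(arr, strOff + 2, byteLen, StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args) {
        // Hand-built entry: count = 2, then 'h' and 'i' as 16-bit LE units.
        byte[] entry = {2, 0, 'h', 0, 'i', 0};
        System.out.println(utf16At(entry, 0)); // prints "hi"
    }
}
```

Swapping this in for the byte-by-byte copy would matter mainly for manifests containing non-ASCII labels or names.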

User · Answer

This Java method, which runs on Android, documents what I have been able to interpret about the binary format of the AndroidManifest.xml file in the .apk package. The second code box shows how to call decompressXML and how to load the byte[] from the app package file on the device. (There are fields whose purpose I don't understand; if you know what they mean, tell me and I'll update the info.)

    // decompressXML -- Parse the 'compressed' binary form of Android XML docs
    // such as for AndroidManifest.xml in .apk files
    public static int endDocTag = 0x00100101;
    public static int startTag =  0x00100102;
    public static int endTag =    0x00100103;
    public void decompressXML(byte[] xml) {
    // Compressed XML file/bytes starts with 0x24 bytes of data,
    // 9 32 bit words in little endian order (LSB first):
    //   0th word is 03 00 08 00
    //   3rd word SEEMS TO BE:  Offset at the end of StringTable
    //   4th word is: Number of strings in string table
    // WARNING: Sometimes I indiscriminately display or refer to a word in
    //   little endian storage format, or in integer format (ie MSB first).
    int numbStrings = LEW(xml, 4*4);

    // StringIndexTable starts at offset 0x24, an array of 32 bit LE offsets
    // of the length/string data in the StringTable.
    int sitOff = 0x24;  // Offset of start of StringIndexTable

    // StringTable, each string is represented with a 16 bit little endian
    // character count, followed by that number of 16 bit (LE) (Unicode) chars.
    int stOff = sitOff + numbStrings*4;  // StringTable follows StrIndexTable

    // XMLTags, The XML tag tree starts after some unknown content after the
    // StringTable.  There is some unknown data after the StringTable, scan
    // forward from this point to the flag for the start of an XML start tag.
    int xmlTagOff = LEW(xml, 3*4);  // Start from the offset in the 3rd word.
    // Scan forward until we find the bytes: 0x02011000 (0x00100102 in normal int)
    for (int ii=xmlTagOff; ii<xml.length-4; ii+=4) {
      if (LEW(xml, ii) == startTag) {
        xmlTagOff = ii;  break;
      }
    } // end of hack, scanning for start of first start tag

    // XML tags and attributes:
    // Every XML start and end tag consists of 6 32 bit words:
    //   0th word: 02011000 for startTag and 03011000 for endTag
    //   1st word: a flag?, like 38000000
    //   2nd word: Line of where this tag appeared in the original source file
    //   3rd word: FFFFFFFF ??
    //   4th word: StringIndex of NameSpace name, or FFFFFFFF for default NS
    //   5th word: StringIndex of Element Name
    //   (Note: 01011000 in 0th word means end of XML document, endDocTag)

    // Start tags (not end tags) contain 3 more words:
    //   6th word: 14001400 meaning??
    //   7th word: Number of Attributes that follow this tag (follow word 8th)
    //   8th word: 00000000 meaning??

    // Attributes consist of 5 words:
    //   0th word: StringIndex of Attribute Name's Namespace, or FFFFFFFF
    //   1st word: StringIndex of Attribute Name
    //   2nd word: StringIndex of Attribute Value, or FFFFFFFF if ResourceId used
    //   3rd word: Flags?
    //   4th word: str ind of attr value again, or ResourceId of value

    // TMP, dump string table to tr for debugging
    //tr.addSelect("strings", null);
    //for (int ii=0; ii<numbStrings; ii++) {
    //  // Length of string starts at StringTable plus offset in StrIndTable
    //  String str = compXmlString(xml, sitOff, stOff, ii);
    //  tr.add(String.valueOf(ii), str);
    //}
    //tr.parent();

    // Step through the XML tree element tags and attributes
    int off = xmlTagOff;
    int indent = 0;
    int startTagLineNo = -2;
    while (off < xml.length) {
      int tag0 = LEW(xml, off);
      //int tag1 = LEW(xml, off+1*4);
      int lineNo = LEW(xml, off+2*4);
      //int tag3 = LEW(xml, off+3*4);
      int nameNsSi = LEW(xml, off+4*4);
      int nameSi = LEW(xml, off+5*4);

      if (tag0 == startTag) { // XML START TAG
        int tag6 = LEW(xml, off+6*4);  // Expected to be 14001400
        int numbAttrs = LEW(xml, off+7*4);  // Number of Attributes to follow
        //int tag8 = LEW(xml, off+8*4);  // Expected to be 00000000
        off += 9*4;  // Skip over 6+3 words of startTag data
        String name = compXmlString(xml, sitOff, stOff, nameSi);
        //tr.addSelect(name, null);
        startTagLineNo = lineNo;

        // Look for the Attributes
        StringBuffer sb = new StringBuffer();
        for (int ii=0; ii<numbAttrs; ii++) {
          int attrNameNsSi = LEW(xml, off);  // AttrName Namespace Str Ind, or FFFFFFFF
          int attrNameSi = LEW(xml, off+1*4);  // AttrName String Index
          int attrValueSi = LEW(xml, off+2*4); // AttrValue Str Ind, or FFFFFFFF
          int attrFlags = LEW(xml, off+3*4);
          int attrResId = LEW(xml, off+4*4);  // AttrValue ResourceId or dup AttrValue StrInd
          off += 5*4;  // Skip over the 5 words of an attribute

          String attrName = compXmlString(xml, sitOff, stOff, attrNameSi);
          String attrValue = attrValueSi!=-1
            ? compXmlString(xml, sitOff, stOff, attrValueSi)
            : "resourceID 0x"+Integer.toHexString(attrResId);
          sb.append(" "+attrName+"=\""+attrValue+"\"");
          //tr.add(attrName, attrValue);
        }
        prtIndent(indent, "<"+name+sb+">");
        indent++;

      } else if (tag0 == endTag) { // XML END TAG
        indent--;
        off += 6*4;  // Skip over 6 words of endTag data
        String name = compXmlString(xml, sitOff, stOff, nameSi);
        prtIndent(indent, "</"+name+">  (line "+startTagLineNo+"-"+lineNo+")");
        //tr.parent();  // Step back up the NobTree

      } else if (tag0 == endDocTag) {  // END OF XML DOC TAG
        break;

      } else {
        prt("  Unrecognized tag code '"+Integer.toHexString(tag0)
          +"' at offset "+off);
        break;
      }
    } // end of while loop scanning tags and attributes of XML tree
    prt("    end at offset "+off);
    } // end of decompressXML


    public String compXmlString(byte[] xml, int sitOff, int stOff, int strInd) {
      if (strInd < 0) return null;
      int strOff = stOff + LEW(xml, sitOff+strInd*4);
      return compXmlStringAt(xml, strOff);
    }


    public static String spaces = "                                             ";
    public void prtIndent(int indent, String str) {
      prt(spaces.substring(0, Math.min(indent*2, spaces.length()))+str);
    }

    // prt was not shown in the original post; assumed here to print one line.
    public void prt(String str) {
      System.out.println(str);
    }


    // compXmlStringAt -- Return the string stored in StringTable format at
    // offset strOff.  This offset points to the 16 bit string length, which
    // is followed by that number of 16 bit (Unicode) chars.
    public String compXmlStringAt(byte[] arr, int strOff) {
      int strLen = arr[strOff+1]<<8&0xff00 | arr[strOff]&0xff;
      byte[] chars = new byte[strLen];
      for (int ii=0; ii<strLen; ii++) {
        chars[ii] = arr[strOff+2+ii*2];
      }
      return new String(chars);  // Hack, just use 8 bit chars
    } // end of compXmlStringAt


    // LEW -- Return value of a Little Endian 32 bit word from the byte array
    //   at offset off.
    public int LEW(byte[] arr, int off) {
      return arr[off+3]<<24&0xff000000 | arr[off+2]<<16&0xff0000
        | arr[off+1]<<8&0xff00 | arr[off]&0xFF;
    } // end of LEW

This method reads the AndroidManifest into a byte[] for processing:

    // Requires: import java.io.InputStream; import java.util.jar.JarFile;
    public void getIntents(String path) {
      try {
        JarFile jf = new JarFile(path);
        InputStream is = jf.getInputStream(jf.getEntry("AndroidManifest.xml"));
        // available() is only an estimate; a read loop is more robust.
        byte[] xml = new byte[is.available()];
        int br = is.read(xml);
        //Tree tr = TrunkFactory.newTree();
        decompressXML(xml);
        //prt("XML\n"+tr.list());
      } catch (Exception ex) {
        prt("getIntents, ex: "+ex);  ex.printStackTrace();
      }
    } // end of getIntents

Most apps are stored in /system/app, which is readable without root on my Evo; other apps are in /data/app, which I needed root to see. The 'path' argument above would be something like "/system/app/Weather.apk".
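
For use outside the Android environment, as the question asks, the loading step can be sketched with the standard `java.util.zip` classes, since an .apk is a zip archive. The sketch below reads the whole `AndroidManifest.xml` entry with a read loop and reads little-endian 32-bit words through `ByteBuffer`. The class name `ManifestBytes` and the methods `readAll`, `readManifest` and `lew` are hypothetical names for illustration; the offset `4*4` for the string count is taken from the header description in this answer, not from official documentation.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class ManifestBytes {
    // Read a stream to the end; a single read() may return fewer bytes than requested.
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    // Load the raw (binary XML) manifest bytes from an .apk on disk.
    static byte[] readManifest(String apkPath) throws IOException {
        try (ZipFile zip = new ZipFile(apkPath)) {
            ZipEntry entry = zip.getEntry("AndroidManifest.xml");
            if (entry == null) {
                throw new IOException("no AndroidManifest.xml in " + apkPath);
            }
            try (InputStream in = zip.getInputStream(entry)) {
                return readAll(in);
            }
        }
    }

    // Little-endian 32-bit word at byte offset off.
    static int lew(byte[] arr, int off) {
        return ByteBuffer.wrap(arr).order(ByteOrder.LITTLE_ENDIAN).getInt(off);
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java ManifestBytes <path-to-apk>");
            return;
        }
        byte[] xml = readManifest(args[0]);
        System.out.println("first word:   0x" + Integer.toHexString(lew(xml, 0)));
        System.out.println("string count: " + lew(xml, 4 * 4));
    }
}
```

The resulting byte[] can be passed to a parser such as decompressXML; `ByteBuffer.getInt` throws `IndexOutOfBoundsException` on a truncated file instead of silently reading garbage.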
