壓倒

Ye0L


[Python] Building a Crawler with Selenium (2)

2022. 9. 30. 02:26

Building a Crawler with Selenium (1)

The crawler I built earlier with Selenium had one critical flaw.

2022.09.07 - [Dev💻/Python] - [Python] Building a Crawler with Selenium (1)

 

The Problem with the Existing Crawler

The earlier crawler had no deduplication logic, so the same files kept being downloaded and, in practice, I could not collect the malicious APK files I actually wanted. I therefore decided to add deduplication logic.

 

Deduplication Logic

I decided to detect duplicates by comparing MD5 hash values, declared a list to hold them, and implemented the overall code around it.

However, the list was re-initialized on every run, so the MD5 hashes of previously downloaded APKs did not accumulate; duplicates were removed only within a single run of the script. 😓
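
To make the limitation concrete, here is a minimal sketch of deduplication that lives only in memory. The function name filter_new and the sample hash strings are illustrative and do not come from the original crawler.

```python
def filter_new(hashes, seen):
    """Return the hashes not yet in `seen`, adding each new one to `seen`."""
    fresh = []
    for h in hashes:
        if h in seen:
            continue  # duplicate within this run
        seen.add(h)
        fresh.append(h)
    return fresh


# First run: the set starts empty, so duplicates inside the run are removed.
run1_seen = set()
first = filter_new(["aaa", "bbb", "aaa"], run1_seen)

# Second run: a brand-new set knows nothing about the first run,
# so "aaa" is treated as new again.
run2_seen = set()
second = filter_new(["aaa", "ccc"], run2_seen)
```

Because the second run starts from an empty set, "aaa" passes the check again, which is exactly the behavior that persistent storage is meant to fix.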

 

So I figured that a MySQL database could store the MD5 hashes of the APKs cumulatively across runs, and implemented it that way.

 

Pymysql - Installation

I built the database using pymysql, a Python module that interfaces with MySQL.

pymysql can be installed with the following command.

pip install pymysql

 

Pymysql - Creating the DB

A MySQL database can be created in Workbench, but it can also be created from Python code written in VS Code.

First, connect to your MySQL server and create a cursor object.

import pymysql

conn = pymysql.connect(host='localhost', user='root', password='your_password', charset='utf8')
cursor = conn.cursor()

 

Next, create the database with a "CREATE DATABASE" query.

sql = "CREATE DATABASE malware_apk"

 

Once the query is written, execute it and commit to finish creating the database. (A table is created the same way: write the CREATE TABLE query and execute it.)

sql = "CREATE DATABASE malware_apk"  # substitute your own database name, without quotes

cursor.execute(sql)

conn.commit()
conn.close()
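
The post does not show the table definition, so the following is only a sketch of what the setup step might look like. The column layout (an auto-increment id followed by an md5 column) is an assumption chosen to match the q[1] lookup and the INSERT INTO md5(md5) statement used later; the helper names create_database_sql and create_schema are hypothetical.

```python
import re

# Hypothetical schema: column order (id, md5) matches the q[1] lookup used later.
CREATE_TABLE_SQL = (
    "CREATE TABLE IF NOT EXISTS md5 ("
    "id INT AUTO_INCREMENT PRIMARY KEY, "
    "md5 CHAR(32) NOT NULL UNIQUE)"
)


def create_database_sql(db_name):
    """Build a CREATE DATABASE statement for a validated identifier."""
    # Identifiers cannot be passed as %s parameters, so validate them instead.
    if not re.fullmatch(r"[A-Za-z0-9_]+", db_name):
        raise ValueError("unsafe database name: %r" % db_name)
    return "CREATE DATABASE IF NOT EXISTS `%s`" % db_name


def create_schema(cursor, db_name="malware_apk"):
    """Create the database and the hash table if they do not exist yet."""
    cursor.execute(create_database_sql(db_name))
    cursor.execute("USE `%s`" % db_name)
    cursor.execute(CREATE_TABLE_SQL)
```

With a real connection this would be called as create_schema(conn.cursor()) followed by conn.commit(). The IF NOT EXISTS clauses make the setup safe to run more than once.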

 

Pymysql - Implementation

With the table created, I implemented the actual deduplication logic.

First, connect to MySQL and load the existing hash values (SELECT <column name> FROM <table name>) into a list to find out which APK files have already been downloaded.

conn = pymysql.connect(host='localhost', user='root', password='your_password', db='malware_apk', charset='utf8')
cursor = conn.cursor()

# Fetch the existing hash values (column 1 of each row holds the MD5 string)
file_hash = []
cursor.execute("SELECT * FROM md5")
for q in cursor.fetchall():
    temp = q[1]
    file_hash.append(temp)
file_name = []
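
As a possible variation on the snippet above, the hashes could be loaded into a set by selecting only the hash column. This is a sketch, not the author's code: the helper name load_known_hashes is hypothetical, and it assumes the table and column are both named md5, as the later INSERT statement suggests.

```python
def load_known_hashes(cursor):
    """Return the stored MD5 hashes as a set for fast membership tests."""
    # Assumes a table `md5` with a column `md5`, as in the INSERT used later.
    cursor.execute("SELECT md5 FROM md5")
    # With the default cursor each row is a 1-tuple, so take element 0.
    return {row[0] for row in cursor.fetchall()}
```

A set keeps the "already downloaded?" check at roughly constant time as the table grows, whereas a list is scanned from the start on every lookup.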

 

After loading the existing hashes, use Selenium's element lookup (here by element ID rather than by XPath) to read the MD5 hash of the APK file about to be downloaded.

file_md5 = driver.find_element(By.ID, 'md5_hash')
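
The text read from the page may carry surrounding whitespace or uppercase letters (an assumption; the post does not say), which would make an otherwise identical hash fail the comparison. A small, hypothetical helper such as normalize_md5 could guard against that before the value is compared or stored.

```python
import re

_MD5_RE = re.compile(r"[0-9a-f]{32}")


def normalize_md5(text):
    """Return the lowercase 32-character MD5 hex digest in `text`, or None."""
    candidate = text.strip().lower()
    return candidate if _MD5_RE.fullmatch(candidate) else None
```

It would be applied to the element text, for example normalize_md5(file_md5.text), and a None result signals that the page did not contain a usable hash.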

 

If the MD5 hash of the APK to be downloaded is already in the list loaded from the database, continue skips the rest of that for-loop iteration; otherwise execution proceeds to the download routine. (The snippet runs inside the for loop over the collected sample pages, where h is the loop index.)

if file_md5.text in file_hash:
    print(file_name[h] + " already exists.")
    continue

file_hash.append(file_md5.text)
sql = "INSERT INTO md5(md5) values(%s)"
val = (file_md5.text,)  # store the hash just read; file_hash[h] would point at the wrong entry
cursor.execute(sql, val)
# Not a duplicate: proceed with the download
driver.find_element(By.XPATH, '/html/body/main/table/tbody/tr[7]/td/a').click()
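
An alternative to checking the list in Python is to let the database reject duplicates. The sketch below assumes a UNIQUE index on the md5 column, which the post does not confirm, and the helper name record_if_new is hypothetical. It relies on pymysql's cursor.execute returning the number of affected rows.

```python
def record_if_new(cursor, md5_hex):
    """Insert the hash; return True if it was new, False if already stored."""
    # INSERT IGNORE skips rows that would violate the UNIQUE index
    # (assumed here on md5.md5), leaving the affected-row count at 0.
    affected = cursor.execute(
        "INSERT IGNORE INTO md5(md5) VALUES (%s)", (md5_hex,)
    )
    return affected == 1
```

Inside the loop this could read: if not record_if_new(cursor, file_md5.text): continue. The check and the insert then happen in one statement, so two crawler processes running at once could not both record the same hash.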

 

Full Code

The full code is shown below and is also available on GitHub.

https://github.com/byeongyeolahn/Bazaar_Mobile_malware_crawling

import re
import requests
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
import time
import unzip
import md5
import pymysql



def crawling():
    # Connect to MySQL
    conn = pymysql.connect(host='localhost', user='root', password='your_password', db='malware_apk', charset='utf8')
    cursor = conn.cursor()

    # Fetch the existing hash values (column 1 of each row holds the MD5 string)
    file_hash = []
    cursor.execute("SELECT * FROM md5")
    for q in cursor.fetchall():
        temp = q[1]
        file_hash.append(temp)
    file_name = []


    options = Options()

    # Browser options
    user_agent = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/105.0.0.0 Safari/537.36'
    options.add_argument('user-agent=' + user_agent)
    options.add_experimental_option("excludeSwitches", ["enable-logging"])
    options.add_experimental_option('prefs', {
        "download.default_directory": "path to the download folder",
        "download.prompt_for_download": False,
        "download.directory_upgrade": True,
        "safebrowsing.enabled": True})
    driver = webdriver.Chrome(options=options)

    # Initial search query
    query = 'tag:apk'

    # Open the URL
    url1 = 'URL of the site to crawl'
    driver.get(url1)
    time.sleep(1)

    # Navigate to the APK search results
    driver.find_element(By.XPATH, '/html/body/main/div[1]/div/p[2]/a').click()
    search_tab = driver.find_element(By.CSS_SELECTOR, '#search')
    search_tab.send_keys(query)
    search_tab.send_keys(Keys.ENTER)
    time.sleep(3)

    # Collect the sha256 links of the APK samples from each results page
    for k in range(2,6):
        driver.find_element(By.XPATH, '//*[@id="samples_paginate"]/ul/li[{}]/a'.format(k)).click()
        for i in range(1,251):
            tag_td = driver.find_element(By.CSS_SELECTOR, '#samples > tbody > tr:nth-child({}) > td:nth-child(2) > a'.format(i))
            tag_href = tag_td.get_attribute('href')
            file_name.append(tag_href)

    for h in range(len(file_name)):
        url_list = file_name[h]
        driver.get(url_list)
        # Duplicate check
        file_md5 = driver.find_element(By.ID, 'md5_hash')
        if file_md5.text in file_hash:
            print(file_name[h] + " already exists.")
            continue

        file_hash.append(file_md5.text)
        sql = "INSERT INTO md5(md5) values(%s)"
        val = (file_md5.text,)  # store the hash just read; file_hash[h] would point at the wrong entry
        cursor.execute(sql, val)
        # Not a duplicate: proceed with the download
        driver.find_element(By.XPATH, '/html/body/main/table/tbody/tr[7]/td/a').click()
        
        tag_div = driver.find_element(By.CSS_SELECTOR, 'body > main > div.container.text-center')
        tag_button = tag_div.find_element(By.TAG_NAME, 'button')
        tag_id = tag_button.get_attribute('id')
        driver.find_element(By.XPATH, '//*[@id="{}"]'.format(tag_id)).click()
        conn.commit()
        time.sleep(1)

    conn.close()

if __name__ == "__main__":
    # Run the crawler
    crawling()
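
In the code above, an exception partway through (for example, a results page with fewer rows than expected) would leave the database connection and the browser open. One way to guard against that is sketched below. The function run_crawl and its three callable parameters are hypothetical and would require crawling() to be restructured to accept the connection and driver as arguments.

```python
from contextlib import closing


def run_crawl(connect, make_driver, crawl):
    """Run crawl(conn, driver) and always release both resources."""
    # closing() calls conn.close() on exit, even if crawl() raises.
    with closing(connect()) as conn:
        driver = make_driver()
        try:
            return crawl(conn, driver)
        finally:
            driver.quit()  # shut down the browser session
```

Here connect might be a lambda wrapping pymysql.connect(...) and make_driver a lambda wrapping webdriver.Chrome(options=options). Passing them in as callables keeps the cleanup logic independent of how the resources are created.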

* The XPaths and selectors used in this crawler are specific to the target site; every site needs its own values.

 

If you spot an error or a mistake, or have any questions, please leave a comment❗
