Mod Archive Forums

Website => Help & Support => Feedback & Suggestions => Topic started by: olexander on August 07, 2024, 22:32:53

Title: Python script for scraping mod files from the site to m3u
Post by: olexander on August 07, 2024, 22:32:53
Hello, and apologies if this is the wrong place to post this :)
With the help of Google Search and Google Gemini, I am trying to make a Python script that scrapes the mod files listed on the site into an m3u playlist (direct URL links to the mod files), so they can be played directly in a media player.
Here is an example for the "Country" genre:
Code: [Select]
base_url = 'https://modarchive.org/index.php?query=18&request=search&search_type=genre&'

This genre has two pages of results, so set
Code: [Select]
num_pages = 2

Make these changes and launch the script (I am on Linux):
Code: [Select]
./<namescript>.py > country.m3u

Wait for it to finish, add country.m3u to your media player, wait a bit for it to load, and try listening :)
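
The page URLs for a genre could also be built from the genre number instead of editing the string by hand. This is only a sketch: `genre_page_urls` and `SEARCH_URL` are names made up for illustration, and it assumes the site keeps accepting the same `query`, `request`, `search_type` and `page` parameters that appear in the URLs in this post.

```python
from urllib.parse import urlencode

# Assumption: same endpoint as the base_url used in this post.
SEARCH_URL = "https://modarchive.org/index.php"


def genre_page_urls(genre_id, num_pages):
    """Return the search-result URLs for pages 1..num_pages of one genre."""
    urls = []
    for page in range(1, num_pages + 1):
        params = {
            "query": genre_id,        # genre number, e.g. 18 for country
            "request": "search",
            "search_type": "genre",
            "page": page,
        }
        # urlencode joins the parameters with '&' and escapes them if needed
        urls.append(f"{SEARCH_URL}?{urlencode(params)}")
    return urls


if __name__ == "__main__":
    for url in genre_page_urls(18, 2):
        print(url)
```

With `genre_id=18` and `num_pages=2` this gives the same two page addresses as the country example, just without the `#mods` part, which is never sent to the server anyway.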

Code: [Select]
#!/usr/bin/env python3

import requests
from bs4 import BeautifulSoup



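def scrape_urls(base_url, num_pages):
    """Collect the https links from pages 1..num_pages of a search."""
    all_urls = []
    for page in range(1, num_pages + 1):
        url = f"{base_url}page={page}#mods"  # Adjust URL pattern as needed
        response = requests.get(url, timeout=30)
        response.raise_for_status()  # stop on an HTTP error instead of parsing an error page
        soup = BeautifulSoup(response.text, 'html.parser')

        # Keeps every absolute https link on the page, not only the mod files
        for link in soup.find_all('a'):
            href = link.get('href')
            if href and href.startswith('https:'):
                all_urls.append(href)

    return all_urls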

# Example usage
# chillout
# base_url = 'https://modarchive.org/index.php?query=106&request=search&search_type=genre&'
# minimal
# base_url = 'https://modarchive.org/index.php?query=101&request=search&search_type=genre&'
# country
base_url = 'https://modarchive.org/index.php?query=18&request=search&search_type=genre&'
num_pages = 2
all_urls = scrape_urls(base_url, num_pages)
for url in all_urls:
    print(url)
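
Because the script above prints every https link on the page, the playlist may contain duplicates and links that are not mod files. A possible cleanup step is sketched below. `build_m3u` and its `must_contain` parameter are hypothetical names, and filtering on the text "downloads.php" is only an assumption about what the download links look like, so check it against the real output first. The `#EXTM3U` first line is the usual extended M3U header; a plain list of URLs also works in most players.

```python
def build_m3u(urls, must_contain=None):
    """Return playlist text: one URL per line, duplicates removed, order kept."""
    seen = set()
    lines = ["#EXTM3U"]
    for url in urls:
        # Optional filter: skip links that do not contain the given text
        if must_contain and must_contain not in url:
            continue
        # Skip links that were already added
        if url in seen:
            continue
        seen.add(url)
        lines.append(url)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up sample links, only to show the behaviour
    sample = [
        "https://example.org/about",
        "https://example.org/downloads.php?moduleid=1#a.mod",
        "https://example.org/downloads.php?moduleid=1#a.mod",
        "https://example.org/downloads.php?moduleid=2#b.mod",
    ]
    print(build_m3u(sample, must_contain="downloads.php"), end="")
```

In the script, the final loop could then be replaced with something like `print(build_m3u(all_urls), end="")`, still redirected to country.m3u as before.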