# 🔍 REVERSE ENGINEER AUMENTUM WEB ACCESS

## 🎯 Goal
Find out how Aumentum Web Access correctly discovers all 46 pages for PL11089, so we can replicate it in our Python code.

---

## 📋 **METHOD 1: Browser Console (EASIEST)**

### Steps:

1. **Open Aumentum Web Access**
   ```
   http://10.10.10.3:8080/lrswa
   Login: admin / admin
   ```

2. **Search for and open PL11089**
   - Click on Type 103 (Property File - 46 pages)

3. **Open Developer Console**
   - Press `F12` (Windows/Linux) or `Cmd+Option+I` (macOS)
   - Or right-click → "Inspect" → "Console" tab

4. **Run the JavaScript**
   - Copy the entire contents of `reverse_engineer_web_access.js`
   - Paste into the Console
   - Press Enter

5. **Copy the output**
   - The script will print all UUIDs it finds
   - Copy everything and send it back

### What to Look For:
```
Page 1:
  UUID: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
  Name: page001.jpg
  ...
```
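Once the console output has been pasted into a text file, a small parser makes it easy to check that all 46 pages are present. This is a minimal sketch that assumes the script prints exactly the `Page N:` / `UUID:` layout shown above; `parse_console_output` is a hypothetical helper name, not part of `reverse_engineer_web_access.js`.

```python
import re

# Canonical 8-4-4-4-12 hexadecimal UUID layout.
UUID_PATTERN = r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
PAGE_RE = re.compile(r"Page\s+(\d+):")
UUID_LINE_RE = re.compile(r"UUID:\s*(" + UUID_PATTERN + r")")


def parse_console_output(text):
    """Return a list of (page_number, uuid) tuples from pasted console text."""
    pages = []
    current_page = None
    for line in text.splitlines():
        page_match = PAGE_RE.search(line)
        if page_match:
            current_page = int(page_match.group(1))
            continue
        uuid_match = UUID_LINE_RE.search(line)
        if uuid_match and current_page is not None:
            # Lower-case so later comparisons are case-insensitive.
            pages.append((current_page, uuid_match.group(1).lower()))
    return pages
```

For PL11089 Type 103, `len(parse_console_output(text))` should come out at 46 if the script found every page.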

---

## 📋 **METHOD 2: Network Inspector (BEST)**

### Steps:

1. **Open Aumentum Web Access**
   ```
   http://10.10.10.3:8080/lrswa
   ```

2. **Open DevTools BEFORE searching**
   - Press `F12`
   - Go to **Network** tab
   - Check "Preserve log" (called "Persist Logs" in Firefox)

3. **Search for PL11089**
   - Enter search: PL11089
   - Click on Type 103 (Property File)

4. **Find these requests in Network tab:**

   **A) Document Search Request:**
   ```
   Request: GET /lrswa/search.do?entityType=SourceDocument&documentNumber=PL11089
   ```
   - Click on it
   - Go to "Response" tab
   - **Copy the entire response**
   - Save as `search_response.json`

   **B) Document Details Request:**
   ```
   Request: GET /lrswa/details/document.do?id=10000000013791&v=...
   ```
   - Click on it
   - Go to "Response" tab
   - **Copy the entire response**
   - Save as `document_details.json` (or `document_details.html` if the response is HTML)

   **C) Individual Page Requests:**
   ```
   Request: GET /lrswa/documentPage.do?uuid=xxxxxxxx-xxxx-...
   ```
   - Note the UUIDs being requested
   - Copy the first 5-10 UUIDs

5. **Send all captured data:**
   - `search_response.json`
   - `document_details.json` (or `document_details.html`)
   - List of UUIDs from page requests
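Instead of copying UUIDs by hand, the Network tab can export the whole session as a HAR file (a JSON document with one entry per request). The sketch below pulls the `uuid` parameter out of every `documentPage.do` request in such an export. It assumes the page requests really use the `uuid` query parameter shown in step 4C; the function name and the `webaccess.har` file name are hypothetical.

```python
import json
from urllib.parse import urlsplit, parse_qs


def page_uuids_from_har(har):
    """Return the uuid values of documentPage.do requests, in request order."""
    uuids = []
    for entry in har["log"]["entries"]:
        url = urlsplit(entry["request"]["url"])
        if not url.path.endswith("/documentPage.do"):
            continue
        for value in parse_qs(url.query).get("uuid", []):
            if value not in uuids:  # skip repeated requests for the same page
                uuids.append(value)
    return uuids


# Usage (file name is an assumption):
#   with open("webaccess.har", encoding="utf-8") as f:
#       print(page_uuids_from_har(json.load(f)))
```

Note that the viewer may only request pages as they are displayed, so this list can be shorter than 46 unless every page was opened during the capture.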

### What We're Looking For:

We need to find WHERE the full list of 46 UUIDs comes from:
- Is it in the search response?
- Is it in the details response?
- Is it loaded via a separate AJAX call?
- Is it embedded in the HTML?
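A quick way to answer these questions is to count the distinct UUIDs in each saved response: whichever file contains 46 of them is the likely source of the list. The following is a rough sketch; the helper names are hypothetical, and it simply pattern-matches UUID-shaped strings, so unrelated UUIDs in a response (session or entity IDs, for example) would inflate the count.

```python
import re
from pathlib import Path

UUID_RE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def unique_uuids(text):
    """Return distinct UUID-shaped strings in first-seen order, lower-cased."""
    seen = {}
    for match in UUID_RE.findall(text):
        seen.setdefault(match.lower(), None)
    return list(seen)


def summarize_captures(paths):
    """Map each path to its distinct-UUID count, or None if the file is missing."""
    summary = {}
    for path in paths:
        file = Path(path)
        if not file.exists():
            summary[path] = None
            continue
        text = file.read_text(encoding="utf-8", errors="replace")
        summary[path] = len(unique_uuids(text))
    return summary


# Usage, with the file names from step 4:
#   print(summarize_captures(["search_response.json", "document_details.json"]))
```

If neither file reaches 46, the list probably arrives through a separate request that still needs to be found in the Network tab.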

---

## 📋 **METHOD 3: Python Script**

### Steps:

1. **Run the capture script:**
   ```bash
   cd /home/plagis/workspace/plagis_aumentum
   python3 capture_web_access_api.py
   ```

2. **Check the output files:**
   ```bash
   cat /tmp/search_response.json
   cat /tmp/document_details.json
   cat /tmp/extracted_uuids.txt
   ```

3. **Send the files:**
   - Copy the contents of these files
   - Send them back for analysis
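If `capture_web_access_api.py` is not available, the two requests from Method 2 can be approximated with the standard library. This is only a sketch and not the contents of that script: it rebuilds the URLs listed in Method 2, leaves out the elided `v=...` parameter, and does not log in, because the login mechanism is not documented here. The session behind `opener` would need to be authenticated some other way before `fetch` returns useful data.

```python
import http.cookiejar
import urllib.request
from pathlib import Path
from urllib.parse import urlencode

BASE_URL = "http://10.10.10.3:8080/lrswa"


def search_url(document_number):
    """Build the document search URL seen in the Network tab."""
    query = urlencode({"entityType": "SourceDocument", "documentNumber": document_number})
    return f"{BASE_URL}/search.do?{query}"


def details_url(document_id):
    """Build the document details URL (without the elided 'v' parameter)."""
    return f"{BASE_URL}/details/document.do?{urlencode({'id': document_id})}"


def make_opener():
    """Return an opener that keeps cookies between requests."""
    jar = http.cookiejar.CookieJar()
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))


def fetch(opener, url, out_path):
    """Download url, save the raw body to out_path, and return its size in bytes."""
    with opener.open(url, timeout=30) as response:
        body = response.read()
    Path(out_path).write_bytes(body)
    return len(body)


# Usage (assumes the opener's session is already authenticated):
#   opener = make_opener()
#   fetch(opener, details_url(10000000013791), "/tmp/document_details.json")
```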

---

## 📋 **METHOD 4: Network Packet Capture**

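If the methods above don't work, we can capture the actual HTTP traffic. This is possible because Web Access is served over plain HTTP on port 8080, so the payload is readable on the wire: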

### Using tcpdump:

```bash
# On the Aumentum server or your machine
sudo tcpdump -i any -s 0 -w /tmp/webaccess.pcap 'host 10.10.10.3 and port 8080'

# Then use Web Access to open PL11089
# Stop tcpdump (Ctrl+C)

# Analyze with (-a makes grep treat the output as text even if it contains binary bytes):
tcpdump -A -r /tmp/webaccess.pcap | grep -a -A50 "document\.do"
```
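The `grep` output can be long, so it may help to list just the request lines first. The sketch below scans the text produced by `tcpdump -A` (for example redirected to a file such as `/tmp/webaccess.txt`, a hypothetical name) and returns each distinct `/lrswa/` request in the order it was seen. It searches anywhere in the line because `tcpdump -A` often prints packet-header bytes in front of the HTTP request line.

```python
import re

# Matches an HTTP/1.x request line for any path under /lrswa/.
REQUEST_RE = re.compile(r"(GET|POST) (/lrswa/\S*) HTTP/1\.[01]")


def requests_from_dump(text):
    """Return distinct (method, path) pairs found in tcpdump -A output, in order."""
    found = []
    for line in text.splitlines():
        match = REQUEST_RE.search(line)
        if match:
            pair = (match.group(1), match.group(2))
            if pair not in found:
                found.append(pair)
    return found


# Usage:
#   with open("/tmp/webaccess.txt", encoding="utf-8", errors="replace") as f:
#       for method, path in requests_from_dump(f.read()):
#           print(method, path)
```

Any endpoint in this list that did not show up in Methods 1 to 3 is a candidate for the request that returns the page list.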

---

## 🔍 **WHAT WE NEED TO FIND**

The critical question: **How does Web Access get the list of all 46 UUIDs?**

### Possible Answers:

**Option A: All UUIDs in one response**
```json
{
  "pages": [
    {"uuid": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx", "name": "page001.jpg"},
    {"uuid": "yyyyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy", "name": "page002.jpg"},
    ...
  ]
}
```

**Option B: UUIDs in HTML/JavaScript**
```html
<script>
var pages = [
  {uuid: "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"},
  {uuid: "yyyyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy"},
  ...
];
</script>
```

**Option C: Separate AJAX endpoint**
```
GET /lrswa/getPages.do?documentId=10000000013791
→ Returns list of UUIDs
```

**Option D: Server-side logic we can't see**
```java
// Java servlet does:
// 1. Get document_id = 10000000013791
// 2. Query: SELECT ... WHERE entityid = ? OR referenceid = ?
// 3. Apply some discovery algorithm
// 4. Return list of UUIDs
```
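Options A and B can be told apart mechanically from a saved response body. The sketch below returns `"A"` when the body parses as JSON and contains string values under a key named `uuid`, and `"B"` when UUID-shaped strings appear inside a `<script>` element. The key name `uuid` and the function names are assumptions taken from the examples above, not from the real Web Access response. A result of `None` means neither pattern matched, which leaves Options C and D.

```python
import json
import re

UUID_RE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)
SCRIPT_RE = re.compile(r"<script\b[^>]*>(.*?)</script>", re.IGNORECASE | re.DOTALL)


def uuids_in_json(node):
    """Recursively collect string values stored under a key named 'uuid'."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key.lower() == "uuid" and isinstance(value, str):
                found.append(value)
            else:
                found.extend(uuids_in_json(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(uuids_in_json(item))
    return found


def classify_response(body):
    """Return (option, uuids): 'A' for JSON, 'B' for script-embedded, else None."""
    try:
        data = json.loads(body)
    except ValueError:  # not JSON; fall through to the HTML check
        data = None
    if data is not None:
        uuids = uuids_in_json(data)
        if uuids:
            return "A", uuids
    script_uuids = []
    for script in SCRIPT_RE.findall(body):
        script_uuids.extend(UUID_RE.findall(script))
    if script_uuids:
        return "B", script_uuids
    return None, []
```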

---

## 🎯 **CRITICAL DATA TO CAPTURE**

Please capture and send:

1. **✅ Full list of UUIDs** Web Access uses for PL11089 Type 103 (all 46)
2. **✅ The API endpoint** that returns this list
3. **✅ The response format** (JSON, HTML, etc.)
4. **✅ Any query parameters** used (document_id, transaction_id, etc.)

### Example of what we need:

```
API: GET /lrswa/details/document.do?id=10000000013791

Response:
{
  "documentId": 10000000013791,
  "documentNumber": "PL11089",
  "documentType": 103,
  "pageCount": 46,
  "pages": [
    {"uuid": "3eee6f3f-0b98-41b9-a6cb-2c4488152fed"},
    {"uuid": "eac6561d-ae69-4a21-9923-c2a488eac8f3"},
    ...
  ]
}
```

Once we have this, we can:
1. See the exact UUIDs Web Access uses
2. Compare with what our algorithm returns
3. Understand why our algorithm picks wrong files
4. Fix our implementation to match
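The comparison in step 2 can be done with a few lines of Python once both UUID lists are available. This is a minimal sketch with a hypothetical function name; it reports which UUIDs our algorithm misses, which ones it returns that Web Access does not use, and whether the page order matches.

```python
def compare_uuid_lists(web_access, ours):
    """Compare two UUID lists case-insensitively and summarize the differences."""
    wa = [u.lower() for u in web_access]
    mine = [u.lower() for u in ours]
    wa_set, mine_set = set(wa), set(mine)
    return {
        # Pages Web Access shows that our algorithm did not find.
        "missing": [u for u in wa if u not in mine_set],
        # Files our algorithm picked that Web Access does not use.
        "extra": [u for u in mine if u not in wa_set],
        # True only if both lists contain the same UUIDs in the same order.
        "same_order": wa == mine,
    }
```

An empty `missing` and `extra` with `same_order` set to `False` would point at a sorting problem rather than a discovery problem.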

---

## 🚀 **QUICK START**

**Fastest way:** Method 2 (Network Inspector)

1. Open Web Access
2. Press F12 → Network tab → check "Preserve log"
3. Search PL11089 and click on Type 103 (Property File)
4. Find the `/details/document.do` request
5. Copy the Response
6. Send it back

**This should take < 2 minutes!** ⏱️

---

## 📞 **WHAT TO SEND BACK**

Send any of these:
- ✅ Screenshot of Network tab showing requests
- ✅ Copy of JSON/HTML response from `/details/document.do`
- ✅ List of UUIDs from browser console script
- ✅ Output from Python capture script

Any of these will help us understand how Web Access discovers the pages!

---

**GOAL:** Find the **EXACT UUIDs** Web Access uses for PL11089 Type 103 (Property File - 46 pages), so we can match them! 🎯

