# 🔴 NEW ISSUE: PL11089 Now Showing BP102 Content

## 🔍 What's Happening

After applying the workaround that swaps PL11089 and PL689 file associations, PL11089 is now showing **BP102 content** instead.

## 📊 Timeline of Events

1. **Original Problem**: PL11089 → showed PL689 content
2. **Workaround Applied**: Swap PL11089 to use PL689's file (`879dcd53-f552...275.bin`)
3. **New Problem**: PL11089 → now shows BP102 content

## 💡 Root Cause Analysis

This reveals a **deeper problem**: the file `879dcd53-f552-4e82-858f-7e868e60a275.bin` (tagged as PL689 in the database) actually contains **BP102 content**, not PL689 or PL11089 content!

### Database Association Chain:

```
Database Says:
  PL11089 → 3eee6f3f...fed.bin
  PL689   → 879dcd53...275.bin  
  BP102   → ???

File Contents:
  3eee6f3f...fed.bin → PL689 content
  879dcd53...275.bin → BP102 content  ← SURPRISE!
  ???                → PL11089 content (WHERE IS IT?)
```
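The chain above can be written down as a small consistency check. This is a minimal sketch, not code from the service: `db_tags` and `observed` are hypothetical dictionaries filled in by hand from the diagram, and they reuse the shortened file names exactly as written there. BP102 is left out of `db_tags` because its tagged file is still unknown.

```python
# Hypothetical hand-entered data mirroring the diagram above.
# File names are the shortened forms used in the diagram, not full paths.
db_tags = {
    "PL11089": "3eee6f3f...fed.bin",
    "PL689": "879dcd53...275.bin",
}

# What inspection of each file reportedly showed.
observed = {
    "3eee6f3f...fed.bin": "PL689",
    "879dcd53...275.bin": "BP102",
}


def find_mismatches(db_tags, observed):
    """Return (doc, file, actual) for every tag whose file holds another document."""
    mismatches = []
    for doc, file_name in sorted(db_tags.items()):
        actual = observed.get(file_name)  # None if the file was never inspected
        if actual is not None and actual != doc:
            mismatches.append((doc, file_name, actual))
    return mismatches


def missing_content(db_tags, observed):
    """Return tagged documents whose own content was not seen in any inspected file."""
    seen = set(observed.values())
    return sorted(doc for doc in db_tags if doc not in seen)


for doc, file_name, actual in find_mismatches(db_tags, observed):
    print(f"{doc} is tagged with {file_name}, which contains {actual}")
print("No inspected file contains:", missing_content(db_tags, observed))
```

Adding more entries to both dictionaries as files are inspected keeps the list of mismatches and missing documents up to date.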

## 🔍 Investigation Needed

We need to find out:

1. **Where is the REAL PL11089 file?**
   - It's not the file currently tagged as PL11089
   - It's not the file currently tagged as PL689
   - Where is it?

2. **What file is BP102 supposed to use?**
   - BP102 content currently sits in 879dcd53...275.bin (tagged as PL689)
   - Is there another file for BP102?

3. **How many documents are affected?**
   - Is this just PL11089, PL689, and BP102?
   - Or are there many more wrong associations?

## 🧪 Diagnostic Steps

### Step 1: Check Current State

```bash
cd /home/plagis/workspace/plagis_aumentum
chmod +x check_pl11089_current_state.sh
./check_pl11089_current_state.sh
```

This will:
- Query the API for PL11089
- Generate a PDF
- Tell you what document number appears in the PDF

### Step 2: Check What's in Each File

We need to manually verify the content of these files:

```bash
# The API server should have created preview PDFs when the workaround ran
# Check for these files:
ls -la /tmp/verify_*.pdf
ls -la /tmp/test_*.pdf

# Open them to see what document number each contains
```

### Step 3: Query BP102 to See What It Returns

```bash
curl -s "http://localhost:8001/documents/by-document-number?document_number=BP102" | jq .
```

Check if BP102's associations match the file that PL11089 is now using.
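The comparison can also be scripted. The sketch below is a hedged example: the response schema of the API is not documented here, so instead of naming a field it walks the whole JSON and collects every string that ends in `.bin`. The helper names (`build_url`, `collect_bin_refs`, `shared_bin_refs`) are made up for illustration; only the endpoint URL comes from the command above.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl command in Step 3.
BASE_URL = "http://localhost:8001/documents/by-document-number"


def build_url(document_number):
    """Build the same request URL that the curl command uses."""
    query = urllib.parse.urlencode({"document_number": document_number})
    return f"{BASE_URL}?{query}"


def collect_bin_refs(payload):
    """Recursively collect every string ending in '.bin' from parsed JSON."""
    found = set()
    if isinstance(payload, dict):
        for value in payload.values():
            found |= collect_bin_refs(value)
    elif isinstance(payload, list):
        for item in payload:
            found |= collect_bin_refs(item)
    elif isinstance(payload, str) and payload.endswith(".bin"):
        found.add(payload)
    return found


def fetch_bin_refs(document_number):
    """Call the API and return the .bin references found in the response."""
    with urllib.request.urlopen(build_url(document_number), timeout=10) as resp:
        return collect_bin_refs(json.load(resp))


def shared_bin_refs(doc_a, doc_b):
    """Return .bin references that appear in the responses for both documents."""
    return fetch_bin_refs(doc_a) & fetch_bin_refs(doc_b)
```

With the API server running, `shared_bin_refs("PL11089", "BP102")` returns the files both documents point at. An empty set only means no common `.bin` string appeared in the responses; it does not prove the content differs.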

### Step 4: Search for PL11089 in All Files

We need to find which .bin file actually contains PL11089 content. A starting point is to list every node that mentions PL11089 in any property, together with its content URL:

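```sql
-- Get ALL nodes that reference PL11089 in ANY property
SELECT 
    n.id,
    n.uuid,
    cu.content_url,
    np.string_value,
    q.local_name
FROM LRSAdmin.alf_node n
JOIN LRSAdmin.alf_node_properties np ON np.node_id = n.id
JOIN LRSAdmin.alf_qname q ON q.id = np.qname_id
-- In the stock Alfresco schema the content property row stores the
-- alf_content_data id in long_value; confirm this holds for this database
LEFT JOIN (
    SELECT cp.node_id, cp.long_value
    FROM LRSAdmin.alf_node_properties cp
    JOIN LRSAdmin.alf_qname cq ON cq.id = cp.qname_id
    WHERE cq.local_name = 'content'
) c ON c.node_id = n.id
LEFT JOIN LRSAdmin.alf_content_data cd ON cd.id = c.long_value
LEFT JOIN LRSAdmin.alf_content_url cu ON cu.id = cd.content_url_id
WHERE np.string_value LIKE '%PL11089%'
AND n.node_deleted = 0;
```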

## 🔧 Temporary Fix Options

### Option A: Disable the Workaround

If the workaround is making things worse, we can temporarily disable it:

```python
# In aumentum_browser_service.py, comment out the ASSOCIATION_FIXES:

# ASSOCIATION_FIXES = {
#     'PL11089': { ... },
#     'PL689': { ... }
# }
ASSOCIATION_FIXES = {}  # Disabled until we understand the full picture
```

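This reverts to the original behavior. Going by the database associations above, that should mean:
- PL11089 → shows PL689 content (original problem)
- PL689 → reads `879dcd53...275.bin`, so it will show BP102 content if the root cause analysis is right (not yet verified)
- BP102 → unknown until we find which file it is tagged with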

### Option B: Find and Map All Correct Files

We need to:
1. Identify which .bin file contains PL11089 content
2. Identify which .bin file contains PL689 content  
3. Identify which .bin file contains BP102 content
4. Update the workaround with correct mappings
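Once files have been inspected, the observations can be inverted into a document-to-file mapping. The following is a minimal sketch under stated assumptions: `build_corrected_mapping` is a hypothetical helper, the `observed` data is copied from the root cause analysis, and the output is not in the format `ASSOCIATION_FIXES` expects, because that format is not shown in this note.

```python
def build_corrected_mapping(observed, wanted):
    """Invert file -> content observations into document -> file.

    observed: dict of file name -> document number seen inside that file
    wanted:   iterable of document numbers that need a mapping
    Returns (mapping, unresolved, conflicts).
    """
    by_doc = {}
    for file_name, doc in observed.items():
        by_doc.setdefault(doc, []).append(file_name)

    mapping, unresolved, conflicts = {}, [], {}
    for doc in wanted:
        files = sorted(by_doc.get(doc, []))
        if not files:
            unresolved.append(doc)   # content not located yet
        elif len(files) > 1:
            conflicts[doc] = files   # several candidates, needs a human decision
        else:
            mapping[doc] = files[0]
    return mapping, unresolved, conflicts


# Shortened file names as used in the root cause analysis.
observed = {
    "3eee6f3f...fed.bin": "PL689",
    "879dcd53...275.bin": "BP102",
}
mapping, unresolved, conflicts = build_corrected_mapping(
    observed, ["PL11089", "PL689", "BP102"]
)
print("Resolved:", mapping)
print("Still missing:", unresolved)
print("Ambiguous:", conflicts)
```

Keeping unresolved and ambiguous documents separate avoids writing a guess into the workaround, which is how the current swap ended up pointing PL11089 at BP102 content.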

### Option C: Manual File Inspection

Convert each suspect file to PDF and manually check:

```bash
# If you have access to the contentstore, you can manually convert files
# to see what's inside

# Check the file currently tagged as PL11089
# /mnt/aumentum_contentstore/contentstore/2015/3/26/15/8/3eee6f3f-0b98-41b9-a6cb-2c4488152fed.bin
# → Should show PL689 based on earlier diagnostic

# Check the file currently tagged as PL689
# /mnt/aumentum_contentstore/contentstore/2015/3/17/10/10/879dcd53-f552-4e82-858f-7e868e60a275.bin
# → Apparently shows BP102!

# Find what file BP102 is tagged with, and check its content
```
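Before converting anything, it helps to know what format each `.bin` file really is. The sketch below checks the leading bytes against a few well-known file signatures. It assumes the contentstore holds the raw, unencrypted file bytes, which has not been confirmed for this installation; `sniff_format` and `sniff_file` are illustrative names.

```python
# Well-known file signatures (magic bytes) for common scanned-document formats.
SIGNATURES = [
    (b"%PDF-", "pdf"),
    (b"II*\x00", "tiff (little-endian)"),
    (b"MM\x00*", "tiff (big-endian)"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"\x89PNG\r\n\x1a\n", "png"),
]


def sniff_format(header):
    """Guess a file format from its first bytes; 'unknown' if nothing matches."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return "unknown"


def sniff_file(path):
    """Read the first 16 bytes of a file and guess its format."""
    with open(path, "rb") as handle:
        return sniff_format(handle.read(16))
```

For example, `sniff_file` can be called with each of the two contentstore paths listed above. A result of `unknown` means the file needs a closer look (for instance a container or proprietary format), not that it is corrupt.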

## 📋 Action Plan

### Immediate Actions:

1. **Run the diagnostic script**:
   ```bash
   ./check_pl11089_current_state.sh
   ```

2. **Check the PDF** to confirm it shows BP102 content

3. **Check server console** for the workaround messages

4. **Report findings**:
   - What document number is shown in the PDF?
   - What does the server console say?
   - Any errors?

### Short-term Fix:

**Disable the workaround** until we can map all files correctly:

```bash
# Edit aumentum_browser_service.py line 808
# Change ASSOCIATION_FIXES to empty dict
```

### Long-term Solution:

1. **Comprehensive audit** of all PL and BP document associations
2. **Manual file inspection** to identify correct mappings
3. **Database correction** or complete workaround mapping
4. **Data quality process** to prevent future mislabeling

## 🚨 Critical Question

**Is this a data entry error or a systematic problem?**

- **If isolated**: Just PL11089, PL689, BP102 were mislabeled during one scanning session
- **If systematic**: Many documents from 2015 might have wrong associations

We need to investigate the scope to determine the right fix.
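One cheap way to gauge scope is to group the mislabeled files by the date embedded in their contentstore path. This is a rough sketch that assumes the directory components are year/month/day/hour/minute, as in the stock Alfresco file content store layout; `stored_on` and `mismatches_per_day` are hypothetical helpers.

```python
import re
from collections import Counter
from datetime import date

# Matches the date portion of paths such as
# .../contentstore/2015/3/26/15/8/<uuid>.bin  (year/month/day/hour/minute)
DATE_IN_PATH = re.compile(
    r"contentstore/(\d{4})/(\d{1,2})/(\d{1,2})/\d{1,2}/\d{1,2}/[^/]+\.bin$"
)


def stored_on(path):
    """Return the date encoded in a contentstore path, or None if absent."""
    match = DATE_IN_PATH.search(path)
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    return date(year, month, day)


def mismatches_per_day(paths):
    """Count mislabeled files per storage date, skipping unparseable paths."""
    return Counter(d for d in map(stored_on, paths) if d is not None)


# The two files already known to be mislabeled (paths from Option C).
known_bad = [
    "/mnt/aumentum_contentstore/contentstore/2015/3/26/15/8/"
    "3eee6f3f-0b98-41b9-a6cb-2c4488152fed.bin",
    "/mnt/aumentum_contentstore/contentstore/2015/3/17/10/10/"
    "879dcd53-f552-4e82-858f-7e868e60a275.bin",
]
for day, count in sorted(mismatches_per_day(known_bad).items()):
    print(day.isoformat(), count)
```

If the audit turns up mismatches clustered on a handful of dates, that points to isolated sessions; a spread across many dates points to something systematic. Under the layout assumption above, the two known files already sit under different dates (17 and 26 March 2015).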

## 📞 Next Steps

Please run the diagnostic and report:

```bash
cd /home/plagis/workspace/plagis_aumentum
./check_pl11089_current_state.sh
xdg-open /tmp/PL11089_current_state.pdf
```

Then tell me:
1. What document number appears in the PDF?
2. What does the server console show?
3. Should we disable the workaround or investigate further?

---

**Current Status**: ⏳ Investigating  
**Priority**: 🔴 High - multiple documents affected  
**Next Action**: Run diagnostic and report findings

