# Rowan Results Interpretation Reference

## Table of Contents
1. [Accessing Workflow Results](#accessing-workflow-results)
2. [Property Prediction Results](#property-prediction-results)
3. [Molecular Modeling Results](#molecular-modeling-results)
4. [Docking Results](#docking-results)
5. [Cofolding Results](#cofolding-results)
6. [Validation and Quality Assessment](#validation-and-quality-assessment)
7. [Common Issues and Solutions](#common-issues-and-solutions)
8. [Data Export Patterns](#data-export-patterns)

---

## Accessing Workflow Results

### Basic Pattern

```python
import rowan

workflow = rowan.submit_pka_workflow(mol, name="test")

# Wait for completion
workflow.wait_for_result()

# Fetch results (not loaded by default)
workflow.fetch_latest(in_place=True)

# Check status before accessing data
if workflow.status == "completed":
    print(workflow.data)
elif workflow.status == "failed":
    print(f"Failed: {workflow.error_message}")
```

### Workflow Status Values

| Status | Description |
|--------|-------------|
| `pending` | Queued, waiting for resources |
| `running` | Currently executing |
| `completed` | Successfully finished |
| `failed` | Execution failed |
| `stopped` | Manually stopped |

### Credits Charged

```python
# After completion
print(f"Credits used: {workflow.credits_charged}")
```

---

## Property Prediction Results

### pKa Results

```python
workflow = rowan.submit_pka_workflow(mol, name="pKa")
workflow.wait_for_result()
workflow.fetch_latest(in_place=True)

data = workflow.data

# Macroscopic pKa
strongest_acid = data['strongest_acid']  # Most acidic pKa
strongest_base = data['strongest_base']  # Most basic pKa (if applicable)

# Microscopic pKa (site-specific)
micro_pkas = data['microscopic_pkas']
for site in micro_pkas:
    print(f"Site {site['atom_index']}: pKa = {site['pka']:.2f}")

# Tautomer analysis
tautomers = data.get('tautomer_populations', {})
for smiles, pop in tautomers.items():
    print(f"{smiles}: {pop:.1%}")
```

**Interpretation (acidic sites):**
- pKa < 0: Very strong acid; fully deprotonated in water
- pKa 0-7: Acidic; largely deprotonated at physiological pH
- pKa 7-14: Weakly acidic; largely neutral at physiological pH
- pKa > 14: Very weak acid; effectively non-ionizable in water
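To relate a predicted pKa to behavior at a given pH, the Henderson-Hasselbalch relationship gives the fraction of an acidic site that is deprotonated. A minimal sketch (the helper name is illustrative; `pka` would come from `data['strongest_acid']` above):

```python
def fraction_ionized_acid(pka: float, ph: float = 7.4) -> float:
    """Fraction deprotonated at the given pH: [A-] / ([HA] + [A-])."""
    return 1.0 / (1.0 + 10 ** (pka - ph))

# A site with predicted pKa 4.2 is ~99.9% deprotonated at pH 7.4
print(f"Fraction ionized: {fraction_ionized_acid(4.2):.1%}")
```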
---

### Redox Potential Results

```python
data = workflow.data

oxidation_potential = data['oxidation_potential']  # V vs SHE
reduction_potential = data['reduction_potential']  # V vs SHE

print(f"Oxidation: {oxidation_potential:.2f} V vs SHE")
print(f"Reduction: {reduction_potential:.2f} V vs SHE")
```

**Interpretation:**
- Higher oxidation potential = harder to oxidize
- Lower reduction potential = harder to reduce
- Compare to reference compounds for context

---

### Solubility Results

```python
data = workflow.data

log_s = data['aqueous_solubility']  # Log10(mol/L)
classification = data['solubility_class']

print(f"Log S: {log_s:.2f}")
print(f"Classification: {classification}")  # "High", "Medium", "Low"
```

**Interpretation:**
- Log S > -1: High solubility (>0.1 M)
- Log S -1 to -3: Medium solubility
- Log S < -3: Low solubility (<0.001 M)

---

### Fukui Index Results

```python
data = workflow.data

# Per-atom reactivity indices
fukui_plus = data['fukui_plus']    # Nucleophilic attack sites
fukui_minus = data['fukui_minus']  # Electrophilic attack sites
fukui_dual = data['fukui_dual']    # Dual descriptor

# Find most reactive sites
for i, (fp, fm, fd) in enumerate(zip(fukui_plus, fukui_minus, fukui_dual)):
    print(f"Atom {i}: f+ = {fp:.3f}, f- = {fm:.3f}, dual = {fd:.3f}")
```

**Interpretation:**
- High f+ = susceptible to nucleophilic attack
- High f- = susceptible to electrophilic attack
- Dual > 0 = electrophilic character, dual < 0 = nucleophilic character

---

## Molecular Modeling Results

### Geometry Optimization Results

```python
data = workflow.data

final_mol = data['final_molecule']  # stjames.Molecule
final_energy = data['energy']       # Hartree
converged = data['convergence']

print(f"Final energy: {final_energy:.6f} Hartree")
print(f"Converged: {converged}")
```

---

### Conformer Search Results

```python
data = workflow.data

conformers = data['conformers']
lowest_energy = data['lowest_energy_conformer']

# Analyze conformer distribution
for i, conf in enumerate(conformers):
    rel_energy = (conf['energy'] - conformers[0]['energy']) * 627.509  # kcal/mol
    print(f"Conformer {i}: ΔE = {rel_energy:.2f} kcal/mol")

# Boltzmann weights
weights = data.get('boltzmann_weights', [])
for i, w in enumerate(weights):
    print(f"Conformer {i}: population = {w:.1%}")
```

**Interpretation:**
- Conformers within 3 kcal/mol are typically accessible at room temperature
- The lowest-energy conformer may not be the most populated in solution
- Consider ensemble averaging for properties
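If `boltzmann_weights` is not present in the output, populations can be recomputed from the conformer energies themselves. A minimal sketch at 298.15 K, assuming `conformers` is the list shown above with energies in Hartree:

```python
import math

HARTREE_TO_KCAL = 627.509
R_KCAL = 0.0019872  # gas constant, kcal/(mol·K)
T = 298.15          # temperature, K

energies = [c['energy'] for c in conformers]  # Hartree
rel = [(e - min(energies)) * HARTREE_TO_KCAL for e in energies]  # kcal/mol

# Boltzmann factors, normalized to populations
factors = [math.exp(-d / (R_KCAL * T)) for d in rel]
total = sum(factors)
populations = [f / total for f in factors]

for i, p in enumerate(populations):
    print(f"Conformer {i}: population = {p:.1%}")
```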
---

### Frequency Calculation Results

```python
data = workflow.data

frequencies = data['frequencies']        # cm⁻¹
ir_intensities = data['ir_intensities']  # km/mol
zpe = data['zpe']                        # Hartree
gibbs = data['gibbs_free_energy']        # Hartree

# Check for imaginary frequencies (reported as negative values)
imaginary = [f for f in frequencies if f < 0]
if imaginary:
    print(f"Warning: {len(imaginary)} imaginary frequencies")
    print("Structure may be a transition state or saddle point")
else:
    print("Structure is a true minimum")

# Thermochemistry at 298 K
print(f"ZPE: {zpe * 627.509:.2f} kcal/mol")
print(f"Gibbs free energy: {gibbs:.6f} Hartree")
```

**Interpretation:**
- 0 imaginary frequencies = minimum
- 1 imaginary frequency = transition state
- >1 imaginary frequencies = higher-order saddle point

---

### Dihedral Scan Results

```python
data = workflow.data

angles = data['angles']      # degrees
energies = data['energies']  # Hartree

# Find barrier
min_e = min(energies)
max_e = max(energies)
barrier = (max_e - min_e) * 627.509  # kcal/mol
print(f"Rotation barrier: {barrier:.2f} kcal/mol")

# Find minima
rel_energies = [(e - min_e) * 627.509 for e in energies]
for angle, e in zip(angles, rel_energies):
    if e < 0.5:  # Near minimum
        print(f"Minimum at {angle}°")
```

---

## Docking Results

### Single Docking Results

```python
data = workflow.data

# Docking score (more negative = better)
score = data['docking_score']  # kcal/mol
print(f"Docking score: {score:.2f} kcal/mol")

# All poses
poses = data['poses']
for i, pose in enumerate(poses):
    print(f"Pose {i}: score = {pose['score']:.2f} kcal/mol")

# Ligand strain
strain = data.get('ligand_strain', 0)
print(f"Ligand strain: {strain:.2f} kcal/mol")

# Download poses
workflow.download_sdf_file("docked_poses.sdf")
```

**Interpretation:**
- Vina scores are typically -12 to -6 kcal/mol for drug-like molecules
- More negative = stronger predicted binding
- Ligand strain > 3 kcal/mol suggests an unlikely binding mode

---

### Batch Docking Results

```python
data = workflow.data

results = data['results']
for r in results:
    smiles = r['smiles']
    score = r['best_score']
    strain = r.get('ligand_strain', 0)
    print(f"{smiles[:30]}: score = {score:.2f}, strain = {strain:.2f}")

# Sort by score (most negative first)
sorted_results = sorted(results, key=lambda x: x['best_score'])
print("\nTop 10 hits:")
for r in sorted_results[:10]:
    print(f"{r['smiles']}: {r['best_score']:.2f}")
```

**Scoring Function Differences:**
- **Vina**: Original scoring function
- **Vinardo**: Updated parameters, often more accurate
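For triage, it often helps to filter batch results on both score and strain rather than score alone. A minimal sketch using the `results` list above; the cutoffs (-8 kcal/mol score, 3 kcal/mol strain) are illustrative thresholds, not Rowan defaults:

```python
SCORE_CUTOFF = -8.0   # kcal/mol; illustrative
STRAIN_CUTOFF = 3.0   # kcal/mol; see the ligand strain note above

hits = [
    r for r in results
    if r['best_score'] <= SCORE_CUTOFF and r.get('ligand_strain', 0) <= STRAIN_CUTOFF
]
hits.sort(key=lambda r: r['best_score'])

print(f"{len(hits)} / {len(results)} compounds pass both filters")
for r in hits[:10]:
    print(f"{r['smiles']}: {r['best_score']:.2f} kcal/mol")
```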
---

## Cofolding Results

### Protein-Ligand Complex Prediction

```python
data = workflow.data

# Confidence scores
ptm = data['ptm_score']                # Predicted TM score (0-1)
interface_ptm = data['interface_ptm']  # Interface confidence
aggregate = data['aggregate_score']    # Combined score

print(f"Predicted TM score: {ptm:.3f}")
print(f"Interface pTM: {interface_ptm:.3f}")
print(f"Aggregate score: {aggregate:.3f}")

# Download structure
pdb_content = data['structure_pdb']
with open("complex.pdb", "w") as f:
    f.write(pdb_content)
```

**Confidence Score Interpretation:**

| Score Range | Confidence | Recommendation |
|-------------|------------|----------------|
| > 0.8 | High | Likely accurate |
| 0.5 - 0.8 | Moderate | Use with caution |
| < 0.5 | Low | Validate experimentally |

---

### Interpreting Low Confidence

Low confidence may indicate:
- Novel protein fold not well represented in training data
- Flexible or disordered regions
- Unusual ligand (large, charged, or complex)
- Multiple possible binding modes

**Recommendations for low confidence:**
1. Try multiple models (Chai-1, Boltz-1, Boltz-2)
2. Compare predictions across models
3. Use docking for binding pose refinement
4. Validate with experimental data if available

---

## Validation and Quality Assessment

### Cross-Validation with Multiple Methods

```python
import rowan
import stjames

mol = stjames.Molecule.from_smiles("c1ccccc1O")

# Run with different methods
results = {}
for method in ['gfn2_xtb', 'aimnet2']:
    wf = rowan.submit_basic_calculation_workflow(
        initial_molecule=mol,
        workflow_type="optimization",
        workflow_data={"method": method},
        name=f"opt_{method}"
    )
    wf.wait_for_result()
    wf.fetch_latest(in_place=True)
    results[method] = wf.data['energy']

# Compare energies
for method, energy in results.items():
    print(f"{method}: {energy:.6f} Hartree")
```

### Consistency Checks

```python
# For pKa
def validate_pka(data):
    pka = data['strongest_acid']

    # Check reasonable range
    if pka < -5 or pka > 20:
        print("Warning: pKa outside typical range")

    # Compare with known references
    # (implementation depends on reference data)

# For docking
def validate_docking(data):
    score = data['docking_score']
    strain = data.get('ligand_strain', 0)

    if score > 0:
        print("Warning: Positive docking score suggests poor binding")
    if strain > 5:
        print("Warning: High ligand strain - binding mode may be unrealistic")
```

### Experimental Validation Guidelines

| Property | Validation Method |
|----------|-------------------|
| pKa | Potentiometric titration, UV spectroscopy |
| Solubility | Shake-flask, nephelometry |
| Docking pose | X-ray crystallography, cryo-EM |
| Binding affinity | SPR, ITC, fluorescence polarization |
| Cofolding | X-ray, NMR, HDX-MS |

---

## Common Issues and Solutions

### Issue: Workflow Failed

```python
if workflow.status == "failed":
    print(f"Error: {workflow.error_message}")

# Common causes:
# - Invalid SMILES
# - Molecule too large
# - Convergence failure
# - Credit limit exceeded
```

### Issue: Unexpected Results

1. **pKa off by >2 units**: Check tautomers, ensure correct protonation state
2. **Docking gives positive scores**: Ligand may not fit binding site
3. **Optimization not converged**: Try different starting geometry
4. **High strain energy**: Conformer may be wrong

### Issue: Missing Data Fields

```python
# Use .get() with defaults
energy = data.get('energy', None)
if energy is None:
    print("Energy not available")
```

---

## Data Export Patterns

### Export to CSV

```python
import pandas as pd

# Collect results from multiple workflows
results = []
for wf in workflows:
    wf.fetch_latest(in_place=True)
    if wf.status == "completed":
        results.append({
            'name': wf.name,
            'pka': wf.data.get('strongest_acid'),
            'credits': wf.credits_charged
        })

df = pd.DataFrame(results)
df.to_csv("results.csv", index=False)
```

### Export Structures

```python
# Download SDF with all poses
workflow.download_sdf_file("poses.sdf")

# Download trajectory (for MD)
workflow.download_dcd_files(output_dir="trajectories/")
```
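### Batch Submit and Export

Putting the pieces together: a minimal end-to-end sketch that submits several pKa workflows, waits, and tabulates the results, using only calls shown earlier in this reference; error handling and rate limiting are omitted for brevity:

```python
import pandas as pd
import rowan
import stjames

smiles_list = ["c1ccccc1O", "CC(=O)O", "c1ccncc1"]

# Submit one pKa workflow per molecule
workflows = []
for smi in smiles_list:
    mol = stjames.Molecule.from_smiles(smi)
    workflows.append(rowan.submit_pka_workflow(mol, name=f"pKa {smi}"))

# Wait, fetch, and collect
rows = []
for smi, wf in zip(smiles_list, workflows):
    wf.wait_for_result()
    wf.fetch_latest(in_place=True)
    rows.append({
        'smiles': smi,
        'status': wf.status,
        'strongest_acid': wf.data.get('strongest_acid') if wf.status == "completed" else None,
        'credits': wf.credits_charged,
    })

pd.DataFrame(rows).to_csv("pka_batch.csv", index=False)
```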